Management of Missing Data in Clinical Trials from a Regulatory Perspective
H. M. James Hung, Div. of Biometrics I, OB/OPaSS/CDER/FDA
Presented at the FDA/Industry Workshop, Bethesda, Maryland, September 23, 2004
James Hung, 2004 FDA/Industry

Collaborators
Charles Anello, Yeh-Fong Chen, Kun Jin, Fanhui Kong, Kooros Mahjoob, Robert O’Neill, Ohid Siddiqui
Office of Biostatistics, OPaSS, CDER, Food and Drug Administration

Disclaimer
The views expressed in this presentation are not necessarily those of the U.S. Food and Drug Administration.
Acknowledgment
O’Neill (2003, 2004), Temple (1994-2004)

Outline
• Informative dropout
• Statistical analysis methods
• Methodology consideration
• Summary

Clinical trials focus on the intent-to-treat (ITT) population (including completers and dropouts).
Response variables are often measured over time (e.g., at multiple clinic or hospital visits).

Often the main clinical hypothesis concerns the effect $\delta_K$ of a test drug relative to a control at some time $K$ (e.g., end of study).
Statistical null hypothesis $H_0: \delta_K = 0$, i.e., allowing nonzero $\delta$ at other time points? (Does this make sense?)

It is unclear why testing only at the last time point is most relevant (for simplicity? to avoid statistical adjustment for testing at multiple times?).
Drug effects over time are important information; e.g., it is inconceivable to market a drug that is effective only at Week 6, say.

For drug effect over time (or over some period of time, e.g., at steady state), the relevant null hypothesis is
$H_0: \delta_1 = \cdots = \delta_K = 0$
or $H_0$: slope difference $= 0$ (if the response follows a straight-line model),
or others for the relevant time period.

Informative Dropout
In many disease areas, the dropout rate is high and the results of any analysis for the ITT population are not interpretable because of the large amount of missing data, particularly when dropouts are ‘informative’.

Dropout problems are multi-dimensional, e.g., dropping out for multiple reasons: side effects of the drug, worsening health state, lack of perceived benefit.
Little is known about the real causes of missing data, or about whether the missing mechanism is related to study outcome or treatment.

Informative dropout has many different definitions, e.g.,
- dependent on observed data, dependent on missing data, treatment-related dropout, …
- tied in with the missing mechanism: MCAR, MNAR, NIM, …
O’Neill (2003, 2004)

For regulatory consideration, any treatment-related dropout may be suspected of being informative, and the missing mechanism probably needs to be considered informative (i.e., it may severely bias estimates and tests) unless proven otherwise.

In a clinical trial, each cohort of dropouts, by reason or by dropout time, can be very small. It is difficult or impossible to assess whether missing values are informative.

Based on visual inspection, the drug seems to perform better than placebo.

It is difficult to tell whether the missing mechanism is ‘ignorable’ or not; e.g., in a linear response profile, MAR may be NIM.

These plots show the difficulty of classifying dropouts (informative or not) in individual trials where each dropout cohort is small (though the total dropout rate could be high).
These types of analysis should be done with external historical trials, at least for classification purposes.

Statistical Analysis Methods
Literature guidance
1) No satisfactory statistical analysis method exists for handling non-ignorable missing data.
2) Likelihood-based methods require assumptions about the missing data mechanism (unverifiable from current trial data).

Facts
1) The validity of any analysis method is very much in question.
2) A better alternative method is unclear. Using current trial data to seek an imputation method is futile.
3) Dropouts and missing data are unavoidable.

Glimpse of the analysis problem
$\delta = \mu_1 - \mu_2$ at the last time point
$n_i$ = number of completers in group $i$
$f_i = n_i / N_i$
If there is no missing value, we have
$D = \bar{Y}_1 - \bar{Y}_2$ (unbiased for $\delta$)
$V(D)$ = estimated variance of $D$
$Z = D / [V(D)]^{1/2}$
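As a minimal sketch of the complete-data statistic defined above, the following computes D, V(D), and Z for two arms. The data values and the function name `z_statistic` are hypothetical, and V(D) is taken here to be the sum of the estimated variances of the two sample means (an unpooled estimate), which is one common choice rather than necessarily the one intended on the slide.

```python
import math
from statistics import mean, variance

def z_statistic(group1, group2):
    """Return (D, V(D), Z) with D = mean(group1) - mean(group2)."""
    d = mean(group1) - mean(group2)
    # V(D): sum of the estimated variances of the two sample means (assumed form)
    v = variance(group1) / len(group1) + variance(group2) / len(group2)
    return d, v, d / math.sqrt(v)

# Hypothetical end-of-study responses with no missing values
drug = [5.1, 6.3, 4.8, 7.0, 5.9, 6.1]
placebo = [4.2, 5.0, 3.9, 4.7, 5.3, 4.4]
d, v, z = z_statistic(drug, placebo)
print(f"D = {d:.3f}, V(D) = {v:.4f}, Z = {z:.2f}")
```

Once some end-of-study values are missing, none of these three quantities can be computed directly, which is the problem the following slides address.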

With missing values, $\{D, V(D), Z\}$ are not obtainable. One can try to get $E(D \mid \text{data})$ and $V(D \mid \text{data})$, and construct
$Z^* = E(D \mid \text{data}) / [V(D \mid \text{data})]^{1/2}$
or
$Z^+ = E(D \mid \text{data}) / [V(D)]^{1/2}$

$\bar{Y}_{oi}$ = sample mean of completers
$R_i$ = vector of indicators for completion or dropout
$\bar{Y}_{mi}$ = unobservable sample mean of dropouts

Immediately, when $f_1 \neq f_2$, this statistic has a problem of interpretation unless $R_i$ and $\bar{Y}_{mi}$ are independent (MI). Under MI, $E(\bar{Y}_{mi} \mid \bar{Y}_{oi}, R_i) = E(\bar{Y}_{mi})$. And if $E(\bar{Y}_{mi}) = \mu_i$, then the completer analysis might offer a reasonable estimate of $\delta$.

When $f_1 = f_2 = f$, the statistic is a linear combination of the observed sample mean difference of completers and the difference in conditional means of dropouts (the latter requires models).
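The linear combination described above can be written out as a sketch. It assumes that $D$ is the difference in arm means over all $N_i$ randomized patients, so that each arm mean splits into a completer part with weight $f_i$ and a dropout part with weight $1 - f_i$. The bar notation and the conditioning on $(\bar{Y}_{oi}, R_i)$ are assumptions made for illustration:

```latex
\begin{aligned}
E(D \mid \text{data})
  &= f_1 \bar{Y}_{o1} - f_2 \bar{Y}_{o2}
   + (1 - f_1)\, E(\bar{Y}_{m1} \mid \bar{Y}_{o1}, R_1)
   - (1 - f_2)\, E(\bar{Y}_{m2} \mid \bar{Y}_{o2}, R_2) \\
  &= f\,(\bar{Y}_{o1} - \bar{Y}_{o2})
   + (1 - f)\,\bigl[ E(\bar{Y}_{m1} \mid \bar{Y}_{o1}, R_1)
                   - E(\bar{Y}_{m2} \mid \bar{Y}_{o2}, R_2) \bigr]
   \quad \text{when } f_1 = f_2 = f .
\end{aligned}
```

The first bracketed difference is observable; the second depends entirely on a model for the dropouts, which is where the unverifiable assumptions enter.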

What about $V(D \mid \text{data})$? Another formidable task!
It is difficult for non-likelihood-based methods to provide useful solutions unless some kind of ad hoc conservative imputation is feasible.

LOCF (last observation carried forward)
LOCF tests $H_0: \delta_K = 0$.
LOCF can be biased either in favor of the test drug (e.g., when its effect decays over time*) or against the test drug, even in the case of MCAR.
*Siddiqui and Hung (2003)

For assessing drug effect over time, LOCF can seriously underestimate the variability of measurements and is unrealistic (i.e., it imputes a constant value for every visit after the patient dropped out).
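To make the LOCF mechanics and the decaying-effect bias concrete, here is a small sketch. The function name `locf`, the visit values, and the mean trajectories are all hypothetical; lower scores are assumed to be better.

```python
def locf(visits):
    """Carry the last observed value forward over missing visits (None)."""
    filled, last = [], None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled

# Hypothetical patient who drops out after the third of five visits
observed = [10.0, 8.5, 7.9, None, None]
print(locf(observed))  # [10.0, 8.5, 7.9, 7.9, 7.9]

# Hypothetical true mean trajectories in which the drug effect decays to zero
drug_true = [10.0, 6.0, 8.0, 10.0]
placebo_true = [10.0, 10.0, 10.0, 10.0]

# A drug-arm patient who follows the mean and drops out after the second visit
drug_locf = locf(drug_true[:2] + [None, None])
print(drug_locf[-1] - placebo_true[-1])  # -4.0, although the true final difference is 0.0
```

The imputed tail is a flat line, which is why LOCF both freezes an early treatment difference and understates visit-to-visit variability.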

LAO (last available observation)
Operationally identical to LOCF, this tests some global drug effect over time,
$H_0: \sum_h w_{1h}\mu_{1h} = \sum_h w_{2h}\mu_{2h}$
$w_{ih}$ = E(dropout rate of drug group $i$ at time $h$)
$\mu_{ih}$ = expected response of patients dropping out after time $h$ in drug group $i$
Is this null hypothesis relevant?
Shao and Zhong (2003)

LOCF versus LAO (in red) [figure: response Y by visit, v0 to v3]

The global mean $\mu_i = \sum_h w_{ih}\mu_{ih}$ can be unbiasedly estimated by the sample mean. But the usual MSE from ANOVA may not estimate the right target (Shao and Zhong).
LAO results can be difficult to interpret if dropout reasons or dropout rates differ between treatment groups.
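The following sketch illustrates, on made-up data, why the sample mean of last available observations targets the weighted global mean: it equals the sum over dropout cohorts of (cohort proportion) times (cohort mean). The helper `last_available` and the data are hypothetical, and monotone dropout is assumed so that the number of observed visits identifies the cohort.

```python
from statistics import mean

def last_available(visits):
    """Return the last non-missing value in a patient's visit sequence."""
    return [v for v in visits if v is not None][-1]

# Hypothetical arm: each row is one patient's responses at visits 1..3
arm = [
    [5.0, None, None],  # last seen at visit 1
    [6.0, 4.0, None],   # last seen at visit 2
    [5.5, 4.5, 3.0],    # completer
    [6.5, 5.0, 4.0],    # completer
]
lao_mean = mean(last_available(p) for p in arm)

# Same quantity as a weighted sum over dropout cohorts: sum_h w_h * mean_h
cohorts = {}
for p in arm:
    h = sum(v is not None for v in p)  # visits observed (monotone dropout assumed)
    cohorts.setdefault(h, []).append(last_available(p))
weighted = sum(len(vals) / len(arm) * mean(vals) for vals in cohorts.values())

print(lao_mean, weighted)
```

Because the weights are the arm's own dropout proportions, two arms with different dropout patterns are compared on differently weighted mixtures, which is the interpretability concern raised above.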

If drug effect over time is at issue, why not use all the pertinent data? (Longitudinal data analysis should be more efficient than LAO.)
- need medical colleagues’ buy-in
Ex. Analysis of cuff BP over time may be more powerful (the value of the test statistic is much larger) than LAO.
Hung, Lawrence, Stockbridge, Lipicky (2000)

MMRM* (mixed-effect model repeated measure with saturated model)
Response = µ + treatment + time + treatment*time + baseline + subject(treatment) + error
subject(treatment) and error are random effects
treatment and time are class variables
*Mallinckrodt et al (2001)
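Written with subscripts, the model above might look like the following sketch. The index conventions ($i$ for treatment, $j$ for subject within treatment, $k$ for visit), the symbol names, and the normality of the random terms are assumptions for illustration:

```latex
\begin{aligned}
y_{ijk} &= \mu + \tau_i + \gamma_k + (\tau\gamma)_{ik} + \beta x_{ij} + s_{j(i)} + e_{ijk}, \\
s_{j(i)} &\sim N(0, \sigma_s^2), \qquad e_{ijk} \sim N(0, \sigma_e^2), \\
\delta_K &= \tau_1 - \tau_2 + (\tau\gamma)_{1K} - (\tau\gamma)_{2K}.
\end{aligned}
```

Here $x_{ij}$ is the baseline value, and the last line is the treatment contrast at the final visit $K$ for patients with the same baseline, i.e., the quantity a test of no drug effect at time $K$ would address under this parameterization.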

MMRM* analysis is used to test $H_0: \delta_K = 0$.
- statistically valid under MAR
- seems more stable in terms of type I error rate than LOCF under MCAR or MAR*# (LOCF can be very bad, depending on $\delta$ at other visits)
*Mallinckrodt et al (2001) #Siddiqui and Hung (2003)

LOCF, LAO, and MMRM can be very problematic in the case of informative missingness.
It is not known how to do ‘conservative’ imputation with these methods.

Worst rank/score analysis
Tests drug effect at time $K$ in the presence of events (e.g., death) that cause informatively missing values of the primary study outcome at time $K$.
Example: In congestive heart failure trials, exercise time is missing after death from heart failure.
Lachin (1999)

Assign a worst score to any informatively missing values (due to occurrence of an absorbing event related to progression of disease) and perform a nonparametric rank analysis.
Valid and efficient for testing $H_0$: no treatment difference in distributions of both event time and main study outcome.
Lachin (1999)
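A simplified sketch of the scoring-and-ranking step follows. It assumes higher outcome values are better (e.g., exercise time), marks informatively missing values as `None`, ties all of them at a single worst score without ordering them by event time, and uses a tie-corrected Wilcoxon rank-sum Z as the nonparametric rank analysis. The function name and data are hypothetical.

```python
import math

def worst_rank_z(drug, control):
    """Rank-sum Z after replacing informatively missing values (None) with a worst score."""
    observed = [y for y in drug + control if y is not None]
    worst = min(observed) - 1.0              # below every observed value
    a = [worst if y is None else y for y in drug]
    b = [worst if y is None else y for y in control]
    pooled = sorted(a + b)

    rank, tie_term, i = {}, 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2    # midrank of positions i+1..j
        t = j - i
        tie_term += t**3 - t                 # contribution to the tie correction
        i = j

    n1, n2 = len(a), len(b)
    n = n1 + n2
    w = sum(rank[y] for y in a)              # rank sum of the drug arm
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    return (w - mean_w) / math.sqrt(var_w)

# Hypothetical exercise times (seconds); None = died before the final visit
drug = [420, 510, None, 480, 390]
control = [400, None, None, 450, 380]
z = worst_rank_z(drug, control)
print(f"worst-rank Z = {z:.2f}")
```

A positive Z here reflects both fewer deaths and longer exercise times in the drug arm, so the test addresses the composite hypothesis rather than the non-mortal outcome alone.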

For a drug having little effect on the non-mortal outcome (e.g., exercise time), this analysis, when used to test the non-mortal effect, can be anti-conservative if the drug improves survival.
It is unclear how to perform a reasonable test for the non-mortal effect alone (e.g., a labeling issue).

Time to treatment failure analysis
In time-to-event analysis, if the test drug has severe side effects that cause more dropouts, then a time to treatment failure analysis (failure = event or dropping out due to side effects) may provide a conservative analysis.

Like the worst score/rank analysis, it is unclear how to perform a reasonable test for time to the event of interest alone
- censoring on dropout due to failure?

WLP opposite/pooled imputation
For a binary outcome, opposite imputation imputes the sample event rate of completers in one arm for the unobserved event rate of noncompleters in the opposite arm.
Wittes, Lakatos, Probstfield (1989); Proschan et al (2001)

Pooled imputation imputes the sample event rate of completers from both arms for the unobserved event rate of noncompleters in each arm.
Treat the imputed rate as an ordinary rate. Compute the Z statistic in the ordinary manner using a combination of the observed and the imputed rates.
Wittes et al (1989), Proschan et al (2001)

WLP is less conservative than the worst-case analysis (assign ‘event’ to dropouts in the test drug group and ‘nonevent’ to dropouts in the control group).
Proschan et al (2001)
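The three binary-outcome strategies described above (opposite, pooled, worst case) can be compared in a short sketch. The counts are hypothetical, an event is assumed to be unfavorable, and the "ordinary" Z is assumed here to be the pooled-variance two-proportion statistic computed on all randomized patients; the helper names are illustrative only.

```python
import math

def arm_rate(events, completers, randomized, imputed_rate):
    """Overall event rate: observed completer rate mixed with an imputed dropout rate."""
    f = completers / randomized
    return f * (events / completers) + (1 - f) * imputed_rate

def two_proportion_z(p1, n1, p2, n2):
    """Pooled-variance Z for a difference in proportions (assumed 'ordinary' form)."""
    pbar = (n1 * p1 + n2 * p2) / (n1 + n2)
    return (p1 - p2) / math.sqrt(pbar * (1 - pbar) * (1 / n1 + 1 / n2))

# Hypothetical trial: (events among completers, completers, randomized)
drug, control = (12, 80, 100), (24, 90, 100)
rate_d, rate_c = drug[0] / drug[1], control[0] / control[1]
pooled_rate = (drug[0] + control[0]) / (drug[1] + control[1])

# Imputed dropout event rate for (drug arm, control arm) under each strategy
strategies = {
    "opposite": (rate_c, rate_d),          # each arm borrows the other arm's completer rate
    "pooled": (pooled_rate, pooled_rate),  # both arms use the pooled completer rate
    "worst case": (1.0, 0.0),              # drug dropouts = event, control dropouts = nonevent
}
z = {}
for name, (imp_d, imp_c) in strategies.items():
    p_d = arm_rate(*drug, imp_d)
    p_c = arm_rate(*control, imp_c)
    z[name] = two_proportion_z(p_d, drug[2], p_c, control[2])
    print(f"{name:10s} Z = {z[name]:.2f}")
```

In this made-up example a negative Z favors the drug; pooled imputation gives the most favorable result, opposite imputation pulls it toward the null, and the worst-case assignment reverses the sign, consistent with WLP being less conservative than the worst case.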

Partial list of other well-known methods
Likelihood-based methods
- Pattern-mixture model
- Selection model
Non-likelihood-based methods
- GEE
- Ad hoc imputation methods

Methodology Consideration
O’Neill (2003, 2004)
- better to assume NIM in the planning stage; the missing data process is not directly verifiable
- choice of approach as the primary strategy for handling missing data?
- choice of approaches for sensitivity analysis, robustness analysis?

Unnebrink and Windeler (2001)
• The adequacy of an ad hoc strategy (e.g., LOCF, ranking, imputation of the mean of the other group, etc.) for handling missing values depends on whether the courses of disease are similar in the study groups.
• For large dropout rates or different courses of disease, no adequate recommendations can be given.

In planning strategies for handling missing values, we need to consider:
1) The null hypothesis should be carefully defined in anticipation of missing data. It should not be altered by the presence of missing data after the trial is done, regardless of their pattern.

2) For design, every attempt needs to be made to minimize dropouts.
Alternative designs (e.g., enrichment design*, randomized withdrawal*) may be used to narrow the study population (recognizing the problem of generalizability) if the ITT population cannot be properly studied.
*Temple (2004)

3) For analysis, the method needs to facilitate ‘conservative’ imputation to:
- adjust the effect estimate toward the null
- inflate variability
(double discounting for possible exaggeration from imputation of missing data), e.g., some type of worst score or rank.

4) Seek a missing-mechanism model to help imputation. This requires knowledge of the disease process (how? practical experience is needed). The model needs to be flexible for sensitivity/robustness analysis.
Note: such a model is not verifiable.

5) Conduct better pilot trials or analyze historical data to explore response profiles of dropouts by reason, to see whether the missing mechanism may be related to outcome, and propose a reasonably conservative imputation method.

Key to ‘reasonable’ imputation
$\delta = \mu_1 - \mu_2$ at the last time point
$n_i$ = number of completers in group $i$
If there is no missing value, we have
$D = \bar{Y}_1 - \bar{Y}_2$ (unbiased for $\delta$)
$V(D)$ = estimated variance of $D$
$Z = D / [V(D)]^{1/2}$

With missing values, $\{D, V(D), Z\}$ are not obtainable. One can try to get $E(D \mid \text{data})$ and $V(D \mid \text{data})$, and thus construct
$Z^* = E(D \mid \text{data}) / [V(D \mid \text{data})]^{1/2}$
or
$Z^+ = E(D \mid \text{data}) / [V(D)]^{1/2}$
All need models.
Proschan et al (2001)

The goal is to use a model such that $|Z^*| \le |Z|$ or $|Z^+| \le |Z|$.
Since the functional forms of $E(D \mid \text{data})$ and $V(D \mid \text{data})$ are unavailable, using a linear model to remove the first-order effect of the data is the first step. Then, what is the impact of imposing such a model on estimation of $V(D \mid \text{data})$ or $V(D)$?

SUMMARY
Intent-to-treat is the goal. If the dropout rate is high, an interpretable intent-to-treat analysis may not be achievable. Alternative designs (e.g., enrichment design) that narrow the study population may need to be considered (caveat: generalizability of interpretation).

Intuitively, use of all data seems more promising than use of endpoint data alone for guiding how to reasonably impute missing values. Yet this advantage comes at a price: it depends on unverifiable statistical models. Thus, every method needs to facilitate a ‘conservative’ imputation approach.

For regulatory applications, every attempt needs to be made to:
- minimize dropout
- explore the response pattern of dropouts in order to be able to propose a reasonably conservative imputation method
- propose conservative strategies for primary analysis and sensitivity analyses

Selected References
Lachin (1999, Controlled Clinical Trials)
Unnebrink, Windeler (2001, Statistics in Medicine)
Shao, Zhong (2003, Statistics in Medicine)
Proschan, McMahon, et al (2001, Journal of Statistical Planning and Inference)
Wittes, Lakatos, Probstfield (1989, Statistics in Medicine)
Mallinckrodt et al (2003, ASA JSM)
Siddiqui, Hung (2003, ASA JSM)
Hung, Lawrence, Stockbridge, Lipicky (2000, unpublished manuscript)

O’Neill (2003, ASA JSM; 2004, DIA EuroMeeting)
Temple (1994-2004, Lecture notes on Clinical Trial Designs)
Temple (2004, Society of Clinical Trials talk)