We have read with interest the article “Achieving deeper molecular response is associated with a better clinical outcome in chronic myeloid leukemia patients on imatinib front-line therapy” by Etienne et al.1 The article covers an important topic, namely the prognostic value of the surrogate end point “complete molecular response”. However, in our opinion, it has some serious methodological problems concerning the handling of time-varying variables, i.e. the remission status. Since we have seen this kind of analysis reported at many conferences and in the literature, we think it is necessary to discuss the shortcomings of this approach.
Etienne et al. estimated the time to an event or failure starting at the date of the complete cytogenetic remission (CCyR) via Kaplan-Meier curves. Event and failure were defined in the sense of the so-called event-free survival (EFS) and failure-free survival (FFS), respectively. Patients were divided into three groups: CCyR without major molecular remission (CCyR+MMR−), CCyR with MMR but without complete molecular remission (CCyR+MMR+CMR−), and CCyR with MMR and CMR (CCyR+MMR+CMR+). Censoring was carried out at the last observational period under imatinib. Patients without an event or failure who were not observed for at least the median time to CMR were excluded from the analysis.
The well-known Kaplan-Meier estimator for right-censored data rests on three important assumptions. Firstly, the patient samples have to be defined at the starting point, independently of the (future) outcome. Secondly, each patient who has not already experienced an event has to continue to be at risk. Thirdly, censoring has to be independent of the outcome as well.
One may argue whether censoring at the last observational period under imatinib can really be considered to be independent of the outcome “death from any cause on or off therapy”, but at least the first and second assumptions have been violated here. At the time of CCyR, it is unknown whether or not the patient will achieve a CMR in the future. But to achieve a CMR, it is necessary that the patient survives without an event or failure for a certain period of time. Thus, the classification of the patients into remission groups depended heavily on the outcome and not (as required) on a baseline variable. Excluding the patients with short observation times and without an event, as the Authors did, does not improve the situation but rather aggravates the problem.
Furthermore, the patients in the CMR group were not at risk until they had achieved a CMR. If they had experienced an event before achieving a CMR, they would be in one of the other groups. In contrast, the patients of the CCyR+MMR− group were already at risk from the beginning. Thus, one would expect an advantage for the CCyR+MMR+CMR+ group with respect to EFS or FFS even if EFS or FFS were completely independent of the patient’s remission status. This serious time-dependent bias has already been described and discussed in the literature2,3 and to a fair degree invalidates the results and conclusions of the Authors.
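The size of this bias can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are ours and not taken from the article: exponential event and response times with hypothetical rates, a fixed follow-up of five years, and response times drawn independently of event times, so that remission status has no effect on the outcome by construction.

```python
import random

FOLLOW_UP = 5.0  # years of observation after CCyR (hypothetical)

def simulate(n=20000, seed=1):
    """Split simulated patients into naive 'responder' groups.

    Event and response times are independent, so any difference between
    the groups is produced by the grouping rule alone.
    """
    rng = random.Random(seed)
    responders, non_responders = [], []
    for _ in range(n):
        event_time = rng.expovariate(0.2)     # hypothetical event hazard per year
        response_time = rng.expovariate(0.5)  # hypothetical, drawn independently
        # Naive grouping: "responder" means the response was observed, i.e. it
        # occurred before the event and before the end of follow-up.
        if response_time < min(event_time, FOLLOW_UP):
            responders.append(event_time)
        else:
            non_responders.append(event_time)
    return responders, non_responders

def event_free_fraction(event_times, t):
    """Share of a group still event-free at time t (t must not exceed FOLLOW_UP)."""
    return sum(1 for e in event_times if e > t) / len(event_times)

responders, non_responders = simulate()
print("responders:", round(event_free_fraction(responders, 2.0), 2))
print("non-responders:", round(event_free_fraction(non_responders, 2.0), 2))
```

Under these assumptions the naive responder group should show a clearly higher event-free fraction at two years, although response was generated without any influence on the outcome.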
Treating future outcome information as if it were already known at baseline not only leads to biased results but also creates practical problems of interpretation. If a patient asks a physician at the time of his CCyR which survival curve is relevant for him, what would be the answer? Notably, this approach mixes up cause and effect. If a patient, for example, discontinued the treatment due to toxicity two months after the achievement of a CCyR, the Authors’ approach would conclude (if we assume there is a causal relationship between both events): “The patient suffered from toxicity that caused a treatment cessation because of the non-achievement of a CMR in the future” instead of the more plausible interpretation: “The patient did not achieve a CMR because of treatment cessation for toxicity”.
There are some options to deal with the problem of time-varying variables. The landmark approach4 is widely used and allows for a simple interpretation. However, as stated by Etienne et al., it has the limitation that its perspective depends on the chosen landmark time. The choice of the landmark is crucial, but is often made arbitrarily. The approach has, therefore, been criticized.
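The mechanics of a landmark analysis can be sketched as follows. This is a simplified illustration, not the procedure of any particular study; the record layout (keys "time", "event", "response_time") and the example values are hypothetical.

```python
def landmark_split(patients, landmark):
    """Group patients by their response status at the landmark time.

    Each patient is a dict with (hypothetical) keys:
      "time"          - time from CCyR to event or censoring
      "event"         - True if an event occurred at "time"
      "response_time" - time from CCyR to response, or None if never observed
    """
    responders, non_responders = [], []
    for p in patients:
        if p["time"] <= landmark:
            continue  # event or censoring before the landmark: excluded
        # The clock is reset so that follow-up starts at the landmark.
        record = {"time": p["time"] - landmark, "event": p["event"]}
        rt = p["response_time"]
        if rt is not None and rt <= landmark:
            responders.append(record)
        else:
            # A response after the landmark is ignored for the grouping.
            non_responders.append(record)
    return responders, non_responders

patients = [
    {"time": 0.5, "event": True, "response_time": None},
    {"time": 3.0, "event": False, "response_time": 0.8},
    {"time": 2.0, "event": True, "response_time": 1.5},
    {"time": 4.0, "event": False, "response_time": None},
]
responders, non_responders = landmark_split(patients, landmark=1.0)
```

The groups are fixed at the landmark using only information available at that time, so standard Kaplan-Meier curves can be computed from the landmark onwards. The sketch also makes the dependence on the landmark visible: the third patient counts as a non-responder with a landmark of 1.0 but would count as a responder with a landmark of 1.5.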
Time-varying covariates like remission status can also be included in the Cox model. This approach does not require one specified landmark but uses all the information on the remission status. The major disadvantage of this approach is that it does not provide a graphical representation. If a graphical representation is needed, the method proposed by Simon and Makuch5 would be an alternative option.
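Both options rely on the same data layout, in which each patient's follow-up is split at the time the remission status changes. The sketch below is our own simplified illustration with hypothetical function names and example values: it builds such (start, stop] intervals and then computes a Kaplan-Meier type curve with delayed entry, which is our understanding of the idea behind the Simon and Makuch representation. Fitting a Cox model with a time-varying covariate would use the same intervals as input.

```python
def split_follow_up(time, event, response_time):
    """Return (start, stop, event, responded) intervals for one patient."""
    if response_time is None or response_time >= time:
        return [(0.0, time, event, False)]
    # Before the response the patient is a non-responder and is censored at
    # the response time; afterwards the patient enters the responder risk set.
    return [(0.0, response_time, False, False),
            (response_time, time, event, True)]

def km_delayed_entry(intervals):
    """Kaplan-Meier type estimate from (start, stop, event) rows.

    A row belongs to the risk set at time t if start < t <= stop.
    """
    event_times = sorted({stop for _, stop, event in intervals if event})
    surv, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for start, stop, _ in intervals if start < t <= stop)
        events = sum(1 for _, stop, event in intervals if event and stop == t)
        surv *= 1 - events / at_risk
        curve.append((t, surv))
    return curve

# Hypothetical patients: (time, event, response_time)
patients = [(4.0, True, 1.0), (2.0, True, None), (5.0, False, 3.0), (3.0, True, None)]
rows = [row for p in patients for row in split_follow_up(*p)]
non_responder_curve = km_delayed_entry([(a, b, e) for a, b, e, r in rows if not r])
responder_curve = km_delayed_entry([(a, b, e) for a, b, e, r in rows if r])
```

In this layout no patient contributes to the responder curve before the response has actually occurred, which is exactly the requirement violated by a grouping based on future remission status.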
Besides the methodological problems with the time-dependent bias, we are concerned about the trend in using combined surrogate end points like EFS and FFS that combine outcomes of very different severities. The shortcomings of these end points have already been described in the literature.6,7 Table 2 of the paper by Etienne et al. gives an overview of the separate end points. The most serious outcomes, death and progression, were infrequent in all groups (regardless of the problems described above). Loss of CHR and CCyR was the main problem in the CCyR+MMR− group, while for patients who had achieved at least an MMR, treatment cessation for toxicity seemed to be the biggest issue. Since, from our point of view, toxicity and loss of CHR or CCyR are caused by different biological mechanisms and often require different actions to be taken by the physician, we consider it questionable to combine these end points. Instead, they should be analyzed separately using a competing risk approach, probably with a larger number of events.
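A competing risk analysis of this kind typically reports one cumulative incidence curve per type of event. The following is a minimal sketch of the standard nonparametric cumulative incidence estimator; the cause labels and example data are hypothetical and only serve to show the calculation.

```python
def cumulative_incidence(records, cause):
    """Cumulative incidence of one cause in the presence of competing causes.

    records: list of (time, cause) pairs; cause is None for censored patients.
    Returns a list of (time, cumulative incidence) at each event time.
    """
    event_times = sorted({t for t, c in records if c is not None})
    surv, cif, curve = 1.0, 0.0, []
    for t in event_times:
        at_risk = sum(1 for time, _ in records if time >= t)
        d_cause = sum(1 for time, c in records if time == t and c == cause)
        d_all = sum(1 for time, c in records if time == t and c is not None)
        # Probability of being event-free just before t, times the
        # cause-specific hazard at t.
        cif += surv * d_cause / at_risk
        # Overall event-free probability is reduced by events of any cause.
        surv *= 1 - d_all / at_risk
        curve.append((t, cif))
    return curve

records = [(1.0, "toxicity"), (2.0, "loss of CCyR"), (3.0, None),
           (4.0, "toxicity"), (5.0, None)]
toxicity_curve = cumulative_incidence(records, "toxicity")
loss_curve = cumulative_incidence(records, "loss of CCyR")
```

Unlike one minus a Kaplan-Meier estimate that treats the competing events as censored, the curves obtained this way add up, together with the overall event-free probability, to one at every time point.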
References
- Etienne G, Dulucq S, Nicolini F, Morisset S, Fort M, Schmitt A. Achieving deeper molecular response is associated with a better clinical outcome in chronic myeloid leukemia patients on imatinib frontline therapy. Haematologica. 2014; 99(3):458-64. https://doi.org/10.3324/haematol.2013.095158
- Beyersmann J, Wolkewitz M, Schumacher M. The impact of time-dependent bias in proportional hazards modelling. Stat Med. 2008; 27(30):6439-54. https://doi.org/10.1002/sim.3437
- Mantel N. Responder versus nonresponder comparisons: daunorubicin plus prednisone in treatment of acute nonlymphocytic leukemia. Cancer Treat Rep. 1983; 67(3):315-6.
- Anderson JR, Cain KC, Gelber RD. Analysis of survival by tumor response. J Clin Oncol. 1983; 1(11):710-9.
- Simon R, Makuch RW. A non-parametric graphical representation of the relationship between survival and the occurrence of an event: application to responder versus non-responder bias. Stat Med. 1984; 3(1):35-44. https://doi.org/10.1002/sim.4780030106
- Kantarjian H, O’Brien S, Jabbour E, Shan J, Ravandi F, Kadia T. Impact of Treatment End Point Definitions on Perceived Differences in Long-Term Outcome With Tyrosine Kinase Inhibitor Therapy in Chronic Myeloid Leukemia. J Clin Oncol. 2011; 29(23):3173-8. https://doi.org/10.1200/JCO.2010.33.4169
- Pfirrmann M, Hochhaus A, Lauseker M, Saussele S, Hehlmann R, Hasford J. Recommendations to meet statistical challenges arising from endpoints beyond overall survival in clinical trials on chronic myeloid leukemia. Leukemia. 2011; 25(9):1433-8. https://doi.org/10.1038/leu.2011.116