Abstract
Clinical trial eligibility criteria can unfairly exclude patients or unnecessarily expose them to known risks if criteria are not concordant with drug safety. There are few data evaluating the extent to which acute leukemia eligibility criteria are justified. We analyzed criteria and drug safety data for front-line phase II and/or III acute leukemia trials with start dates 1/1/2010-12/31/2019 registered on clinicaltrials.gov. Multivariable analyses assessed concordance between criteria use and safety data (presence of criteria with a safety signal, or absence of criteria without a signal), and differences between criteria and safety-based limits. Of 250 eligible trials, concordant use of ejection fraction criteria was seen in 34.8%, corrected QT level (QTc) in 22.4%, bilirubin in 68.4%, aspartate transaminase/alanine aminotransferase (AST/ALT) in 58.8%, renal function in 68.4%, human immunodeficiency virus (HIV) in 54.8%, and hepatitis B and C in 42.0% and 41.2%. HIV and hepatitis B and C criteria use was concordant with safety data (adjusted Odds Ratios 2.04 [95%CI: 1.13, 3.66], 2.64 [95%CI: 1.38, 5.04], 2.27 [95%CI: 1.20, 4.32]) but organ function criteria were not (all P>0.05); phase III trials were not more concordant. Bilirubin criteria limits were the same as safety-based limits in 16.0% of trials, AST/ALT in 18.1%, and renal function in 13.9%; in 75.7%, 51.4%, and 56.5% of trials, criteria were more restrictive, respectively, by median differences of 0.2, 0.5, and 0.5 times the upper limits of normal. We found limited drug safety justifications for acute leukemia eligibility criteria. These data define criteria use and limits that can be rationally modified to increase patient inclusion and welfare.
Introduction
Clinical research is ethical when there is fair subject selection and a favorable risk-benefit ratio.1 Recent analyses of solid tumor clinical trials found that eligibility criteria can disproportionately exclude potential participants from historically marginalized groups.2,3 Unless there is a safety-based justification for these criteria, such exclusions are discriminatory and reduce the number of eligible patients, slowing recruitment and inhibiting trial completion. For instance, eliminating exclusions for manageable medical conditions not thought to be justified by drug safety (e.g., coronary stenting, diabetes mellitus) in pancreatic cancer trials increased overall eligibility and reduced disparities for Black compared to White patients (with exclusions, ineligibility rates were 42.4% vs. 33.2%, P=0.02; without exclusions 26.8% vs. 24.8%, P>0.05).2 In this and other examples in non-small cell lung cancer,3 such modifications enhance generalizability and arguably the scientific value necessary for ethical research.1 At the same time, criteria must be used to protect participants from unnecessary risk when safety signals are known. In late 2022, the United States (US) Food and Drug Administration (FDA) was empowered by passage of the Food and Drug Omnibus Reform Act (FDORA) to promote rational revision of criteria to reduce exclusion and enhance safety.4-6
Like unjustified criteria, unjustified variability in the limits assigned to criteria (e.g., the value that defines minimum acceptable renal function) also represents unfair exclusion or risk unless differences are drug safety-based.1 Data on the variability of eligibility criteria limits and their justification in drug safety are few. The use and variability of criteria across phase II and III trials in acute myeloid and lymphoblastic leukemia have also not been well described. Without similar data in leukemia, the blood cancer research community cannot assess where better alignment of criteria and drug-associated risks is needed, and unnecessary exclusion and risk will continue. In this context, we sought to characterize the use, variability, and drug safety-based justification of eligibility criteria, as well as the extent to which they could be rationally modified to promote inclusion and safety.
Methods
Design and objectives
This was a retrospective analysis of phase II and III acute leukemia clinical trial eligibility criteria and their basis in drug safety. Objectives were to assess concordance between trial eligibility criteria and drug safety profiles (with concordance defined as the presence of an eligibility criterion with a drug safety signal or absence of a criterion without a signal), and to determine differences between criteria-based and drug safety-based limits and their variability. We also examined whether concordance and differences improved in later phase studies. Manuscript reporting followed PRISMA guidelines, as applicable. Ethical approval for the study was provided by the Dana-Farber Cancer Institute Office for Human Research Studies as protocol 22-267.
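The concordance definition above can be illustrated with a minimal sketch. The record type, field names, and example data are hypothetical and are not drawn from the study dataset; the sketch only encodes the stated rule that a trial is concordant when a criterion is present with a safety signal or absent without one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialCriterion:
    """One trial's handling of one eligibility criterion (hypothetical record)."""
    criterion_present: bool  # the trial lists the criterion (e.g., a QTc limit)
    safety_signal: bool      # a relevant drug safety signal was known at study start


def is_concordant(record: TrialCriterion) -> bool:
    # Concordant: criterion present with a signal, or absent without a signal.
    return record.criterion_present == record.safety_signal


def concordance_rate(records) -> float:
    """Fraction of records that are concordant."""
    records = list(records)
    if not records:
        raise ValueError("at least one record is required")
    return sum(is_concordant(r) for r in records) / len(records)


# Illustrative data only, not taken from the study.
example = [
    TrialCriterion(True, True),    # concordant: criterion used, signal known
    TrialCriterion(False, False),  # concordant: no criterion, no signal
    TrialCriterion(True, False),   # discordant: potentially unnecessary exclusion
    TrialCriterion(False, True),   # discordant: potentially unaddressed risk
]
print(concordance_rate(example))  # 0.5
```

The two discordant cases correspond to the two harms discussed in the Introduction: exclusion without a safety basis, and exposure to a known risk without a protective criterion.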
Trial, criteria, and drug safety selection, abstraction, and coding
To assess potential changes in criteria that could be made under the aegis of a single regulatory standard, the search focused on trials registered with the US FDA. As US law FDAAA 801 requires all therapeutic clinical trials in the US to be registered on clinicaltrials.gov,7 this database was selected for review and queried on 10/6/2022. The study period (2010-2019) was defined to allow sufficient time after enactment of FDAAA 801 (circa 2008) for all new studies to be registered. The search terms and filters are described in the Online Supplementary Appendix. The results were reviewed to exclude trials that did not meet the above criteria, did not test therapies to treat acute leukemia, tested cellular therapies, only tested therapies in relapsed/refractory disease, and/or only recruited patients outside the US. Searches, result screening, and eligibility criteria coding were independently performed by two trained team members, and consensus arbitration with a third was used to resolve discrepancies.8
An initial list of eligibility criteria was developed collaboratively by the study team based on FDA guidance.4,5 Criteria were coded into binary, categorical, or continuous variables based on FDA guidance when possible and common categories when not. The drug safety review sought to identify safety signals known at the time of study initiation that would justify the use of enrollment criteria. All anti-cancer therapies tested were compiled and reviewed to identify safety data available at the time of the study start date.
Details on criteria and drug safety identification and coding, and the lists of variables used, are shown in the Online Supplementary Methods, Online Supplementary Tables S1, S2.
Statistical analysis
Criteria use and limit differences were reported using descriptive statistics with frequencies and percentages. Criteria limit dispersion was reported as medians with interquartile ranges (IQR) and robust coefficients of variation (rCV; the IQR divided by the median).9 Univariable concordance between laboratory criteria and drug safety signals, and the odds of concordance in phase III relative to phase II studies, were assessed using χ² or Fisher’s exact testing with Odds Ratios and 95% Confidence Intervals. Adjusted Odds Ratios (aOR) of concordance were calculated using multivariable logistic regression models with criteria use as the dependent variable and presence of a relevant drug safety signal as the independent variable (e.g., known association with QT interval prolongation for corrected QT [QTc] criteria), adjusted for phase, year, size, and trial sponsor type(s).10,11 Improvement in alignment of criteria and safety-based limits from phase II to phase III was assessed by the absolute reduction in the median difference between criteria-based and safety-based limits, with significance assessed using the two-sample Mann-Whitney U test, and by the corresponding IQR difference, with significance assessed using the Brown-Forsythe equality of variances test.12 Statistical testing was performed using STATA Version 16.1 (StataCorp; College Station, TX, US). The significance level was pre-specified as alpha=0.05.
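Two of the descriptive quantities named above can be sketched briefly. The analysis itself was performed in STATA; the Python below is only an illustration. The function names, the choice of the "inclusive" quartile method, the Wald (Woolf) interval for the odds ratio, and all example values are assumptions made for this sketch, and the study's exact quartile and interval methods are not stated.

```python
import math
import statistics


def robust_cv(values):
    """Robust coefficient of variation: IQR divided by the median."""
    # Quartile definitions differ between packages; "inclusive" is an assumption.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    median = statistics.median(values)
    if median == 0:
        raise ValueError("rCV is undefined when the median is zero")
    return (q3 - q1) / median


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald (Woolf) 95% CI from a 2x2 table.

    a: criterion present, signal present   b: criterion present, signal absent
    c: criterion absent,  signal present   d: criterion absent,  signal absent
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical bilirubin limits (multiples of ULN), not study data.
limits = [1.5, 1.5, 2.0, 2.0, 2.0, 2.5, 3.0]
print(f"rCV = {robust_cv(limits):.1%}")

# Hypothetical 2x2 counts, not study data.
or_, lo, hi = odds_ratio_wald(20, 10, 10, 20)
print(f"OR = {or_:.2f} (95% CI: {lo:.2f}, {hi:.2f})")
```

An odds ratio above 1 in this layout indicates that a criterion is more likely to be used when a relevant safety signal exists, which is the direction of association that concordance implies. The adjusted estimates reported in this study additionally require a multivariable logistic regression model, which this sketch does not attempt.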
Results
The search resulted in 848 acute myeloid leukemia (AML) and 459 acute lymphocytic leukemia (ALL) trials, of which 250 (19.1%) were eligible for analysis: 190 AML (22.4%) and 60 ALL (13.1%). A flow diagram of exclusions is shown in Online Supplementary Figure S1. Most trials were phase II (203; 81.2%); 47 trials (18.8%) were phase III. Eighty (32.0%) trials were sponsored by the National Institutes of Health, 151 (60.4%) by academic investigators, and 136 (54.4%) by industry. A total of 74 (29.6%) trials had participating sites outside the US. Across all trials there were 162 unique anti-cancer therapies tested (Online Supplementary Table S3), of which 83 (51.2%) were not FDA-approved at the study start date. Approximately half of the trials (135; 54.0%) tested at least one drug that was investigational at study start.
Variability in the use of common eligibility criteria with associated limits is shown in Table 1. Limit variability, measured as rCV, was >10% for the age demarcation of an older adult (12.5%) and for renal function (35.0%), bilirubin (22.5%), drug washout period (75.0%), and prior malignancy washout period (75.0%) limits. Violin plots of these measures’ variability are shown in Online Supplementary Figure S2. Common exclusion criteria without associated limits included diagnosis of human immunodeficiency virus (HIV) (137 trials; 54.8%), hepatitis B (105; 42.0%) and C (103; 41.2%) infection, and central nervous system (CNS) disease (117; 46.8%). Cytoreduction was specifically allowed in 122 (48.8%) trials.
Concordance between criteria with limits and relevant drug safety signals is shown in Figure 1. There was concordance between the presence of ejection fraction limits and chemotherapy associated with congestive heart failure risk in 115 (46.0%) trials. Criteria concordance with other relevant risks was seen for QTc limits in 146 (58.4%) trials, bilirubin in 170 (68.0%) trials, aspartate transaminase/alanine aminotransferase (AST/ALT) in 148 (59.2%) trials, and renal function in 159 (63.6%) trials. Exclusion criteria for HIV, hepatitis B, and hepatitis C were concordant with drug safety in 145 (58.0%), 135 (54.0%), and 131 (52.4%) of trials, respectively.
Unadjusted OR and aOR of concordance are shown in Table 2: HIV and hepatitis B and C criteria use was concordant with drug safety data (aOR 2.04 [95%CI: 1.13, 3.66]; 2.64 [95%CI: 1.38, 5.04]; 2.27 [95%CI: 1.20, 4.32]) but organ function criteria were not (all aOR P>0.05). Sponsor type was associated with eligibility criteria use in several instances. Industry sponsorship was associated with increased odds of hepatitis B and C exclusions (aOR 2.45 [95%CI: 1.13, 5.30] and 2.59 [95%CI: 1.19, 5.64]) and academic sponsorship with increased odds of bilirubin and renal function exclusions (aOR 3.14 [95%CI: 1.62, 6.07] and 2.84 [95%CI: 1.50, 5.36]). The limits imposed did not differ significantly according to the presence or absence of a drug safety signal (all Mann-Whitney U test P>0.05) (Figure 2). Odds of concordance between criteria use and drug safety data were numerically lower for phase III studies compared to phase II studies (Online Supplementary Table S4, Online Supplementary Figure S3); the odds were significantly lower for bilirubin (OR 0.41 [95%CI: 0.21, 0.78]), AST/ALT (OR 0.43 [95%CI: 0.23, 0.83]), and renal function limits (OR 0.53 [95%CI: 0.28, 0.99]).
We next assessed the limits that were used when there was concordance between the presence of a criterion and drug safety data. Drug safety-based and criteria-based limits for bilirubin were the same in 16.0% of these trials, for AST/ALT in 18.1%, and for renal function in 13.9%; in 75.7%, 51.4%, and 56.5% of trials, respectively, the criteria were more restrictive. Specific drug safety limits for left ventricular ejection fraction (LVEF) and QTc were too few for comparison. Differences in renal and hepatic function limits from eligibility criteria and drug safety data are shown in Figure 3.
The median absolute differences between eligibility criteria and drug safety-based limits were 0.5 times the upper limit of normal (ULN) for renal function, 0.5 ULN for AST/ALT, and 0.2 ULN for bilirubin; the respective IQR were 0.9, 2.5, and 1.3.
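The comparison of criteria-based and safety-based limits can be sketched as follows. This is an illustration under stated assumptions rather than the study's procedure: both limits are taken to be upper bounds expressed as multiples of the ULN (so a lower criteria limit is the more restrictive one), and the function names and example pairs are hypothetical.

```python
import statistics


def classify_limit(criteria_limit, safety_limit, tol=1e-9):
    """Compare a trial's limit with the drug safety-based limit (both in x ULN).

    Assumes upper-bound limits, so a smaller criteria limit excludes more patients.
    """
    difference = safety_limit - criteria_limit
    if abs(difference) <= tol:
        return "same"
    return "more restrictive" if difference > 0 else "less restrictive"


def summarize(pairs):
    """Return category counts and the median absolute difference in x ULN."""
    labels = [classify_limit(c, s) for c, s in pairs]
    counts = {
        label: labels.count(label)
        for label in ("same", "more restrictive", "less restrictive")
    }
    median_abs_diff = statistics.median(abs(s - c) for c, s in pairs)
    return counts, median_abs_diff


# Hypothetical (criteria limit, safety-based limit) pairs, not study data.
pairs = [(1.5, 2.0), (2.0, 2.0), (1.5, 3.0), (3.0, 2.0), (1.0, 1.5)]
counts, median_abs_diff = summarize(pairs)
print(counts)
print(median_abs_diff)
```

Criteria defined as lower bounds (for example, a minimum creatinine clearance) would need the sign of the comparison reversed.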
Limit differences between phase II and phase III studies are shown in Online Supplementary Figure S4. For renal function, the median difference between criteria-based and safety-based limits was significantly closer to zero in phase III than in phase II trials (Mann-Whitney U test P=0.03); the differences for bilirubin and AST/ALT limits did not change significantly in phase III. The variability of these differences was similar between phase II and III trials for renal function (IQR 0.81 and 0.50), AST/ALT (2.50 and 2.50), and bilirubin (1.30 and 0.97) (all Brown-Forsythe test P>0.05).
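For readers unfamiliar with the two tests used in the phase comparison, the sketch below computes their test statistics from first principles. It is illustrative only: the function names and data are hypothetical, P values are deliberately omitted because they require reference distributions, and in practice a statistics package would be used (for example, SciPy's `levene` with `center="median"` performs the Brown-Forsythe test).

```python
import statistics


def mann_whitney_u(x, y):
    """U statistic for sample x: number of (x, y) pairs with x > y, ties count 0.5."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in x
        for b in y
    )


def brown_forsythe_f(*groups):
    """Brown-Forsythe F statistic: one-way ANOVA on absolute deviations from group medians."""
    k = len(groups)
    if k < 2 or any(len(g) < 2 for g in groups):
        raise ValueError("need at least two groups with two or more values each")
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each value from its own group's median.
    z = [[abs(v - statistics.median(g)) for v in g] for g in groups]
    z_means = [statistics.fmean(zi) for zi in z]
    grand_mean = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (m - grand_mean) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)
    if within == 0:
        raise ValueError("statistic is undefined when within-group spread is zero")
    return ((n_total - k) / (k - 1)) * between / within


# Hypothetical limit differences (x ULN) by phase, not study data.
phase2 = [0.0, 0.5, 0.5, 1.0, 2.0]
phase3 = [0.0, 0.0, 0.5, 0.5]
print(mann_whitney_u(phase2, phase3))    # 15.0 of 20 possible pairs
print(brown_forsythe_f(phase2, phase3))
```

The Mann-Whitney statistic addresses whether differences tend to be larger in one phase, whereas the Brown-Forsythe statistic addresses whether their spread differs, which is why the two were applied to the median and the IQR comparisons, respectively.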
Discussion
In this analysis assessing the use, variability, and justification of acute leukemia clinical trial eligibility criteria, there was substantial discordance between criteria and drug safety data. The use or absence of many criteria was related to anticipated risk, but for a large proportion it was not. There was concordance between infectious disease criteria and risk, but this was not the case for organ function criteria, and the imposed criteria limits were similarly restrictive irrespective of anticipated risk. Despite the presumption that additional safety data would become available later in drug development, criteria use in phase III studies was not more concordant. When trial criteria use was concordant with safety data, the imposed criteria limits were generally more restrictive than drug safety data and were variable in their degree of restriction. Among these studies, phase III limits were somewhat more aligned with drug safety data than phase II, but their limits were no less variable. Together, these data identify specific criteria that sponsors and investigators should assess for inclusion, removal, and liberalization in order to improve representativeness and accrual, limit the implicit biases that can occur when criteria are not explicit, and improve patient safety.13
These data are consistent with solid tumor studies that demonstrate the presence of unjustified criteria and their potential impact on demographic inequities in participation.2,3 One such study used the Flatiron real-world database and found that if several enrollment criteria for non-small cell lung cancer immunotherapy trials were removed (e.g., neutrophil counts, CNS metastasis), the eligible population would be doubled and would only decrease the projected Hazard Ratio of overall survival by 0.05.3 A study in pancreatic cancer found that Black patients would have been less likely to participate owing to hepatitis and HIV status.2 The lack of justification for infectious disease exclusions in some of the trials in this analysis suggests this inequity may also be present in acute leukemia.
To our knowledge, prior studies regarding acute leukemia enrollment criteria are aligned but distinct and do not address the question of use, variability, or justification across phase II and III studies. A trial published in 2017 enrolled patients with comorbid conditions, organ dysfunction, and poor performance status to show that a trial of low-intensity therapy was possible in this traditionally ineligible population.14 This is important in that it shows trials, rather than just post-approval real-world analyses, are possible for most patient populations. At the same time, it did not assess heterogeneity in trial criteria and when criteria are justified or unjustified. A separate study from the FDA analyzed disparities in ineligibility of screened patients on 13 trials from 2016 to 2019 (N=3192), finding that 27% were ineligible. These data also showed that Black and Hispanic patients with AML were less likely to meet study eligibility requirements due to cardiac function and/or a lack of specific mutations.15 Notably, there was significant selection bias as the analysis was restricted to those who consented to participate, meaning that data on those who were known to be ineligible, and thus not approached, were not included.
The present study adds significantly to this literature because it assesses not only criteria use but also their variability and justification in drug safety. We found that there was wide variability within criteria with associated measures. Some criteria, such as time since prior cancer diagnosis, have limited justification, and others, like bilirubin limits, were not present despite known drug-related hepatotoxicity. We also saw that if a criterion was used, it was generally more restrictive than suggested by drug safety data. Criteria used were also similarly restrictive regardless of whether there were known drug-related risks. Taken together, these data identify criteria that should be specified and others that can be liberalized to promote a rational approach to maximizing the eligible pool of patients without compromising safety.
The systematic approach used to catalog study drug safety profiles relative to enrollment criteria also minimizes the bias of prior studies, which did not account for the more limited knowledge of drug safety known during investigational drug development. Though much of the previously published literature focuses on criteria expansion, our study also identifies areas where known safety signals exist, but explicit criteria are missing. This can put enrollees at unnecessary risk, limit the identification of efficacious drugs, and may increase enrollment bias by leaving related eligibility decisions to individual investigators.
Applying these points to a study included in the analysis, we can see how trial eligibility may be modified in practice. This study tested the combination of nivolumab, cytarabine, and idarubicin in AML, and used exclusion criteria based on hepatic and renal lab values. The labels for the antineoplastic therapies used collectively identify hepatic and renal toxicity (among other side effects), justifying the trial’s use of exclusion criteria related to liver and kidney function. The study’s hepatic and renal criteria limits were bilirubin ≤1.5, AST/ALT ≤2.5, and creatinine ≤1.3 times the ULN, and the conservative limits from drug labels were ≤2, ≤3, ≤2.2 times the ULN, respectively. In this case, expansion of renal and hepatic criteria by 0.5, 0.5, and 1.1 times the ULN could be justified and potentially increase the eligible population while maintaining safety. A similar process, applied systematically across new trials, would align eligibility with drug safety, expanding criteria limits in some trials, contracting limits in others, and making criteria explicit instead of inexact in the rest.
Limitations of this study include the potential for bias due to safety data that were unpublished at the time of study start, which we attempted to minimize through multi-modal searches and use of trial protocols. This mirrors the investigational drug safety data and eligibility requirements that enrolling physicians use during recruitment. While we focused on drug-safety profiles from individual drugs, we could not capture safety justifications based on emerging issues related to the overall intensity of a regimen, where general eligibility restrictions for organ dysfunction or comorbidities may be used. Nonetheless, there were a number of criteria not included where safety signal(s) were identified which would not have been ameliorated. The generalizability of the analysis is limited to clinical trials conducted under FDA regulation, which was done to assess changes in criteria that could be made under a single regulatory standard. Cellular therapy trials were also excluded as the population eligible for cellular therapy treatment, even outside the research context, is much more restricted than the broader acute leukemia population. Other limitations include biases due to the necessary aggregation of unstructured data into analytic variables, which limited our ability to capture vague or inexact criteria, and required us to assume a uniform normal range of laboratory values, and the moderate sample size, which was limited by the availability of trials.
In summary, a substantial proportion of acute leukemia clinical trial enrollment criteria do not appear justified by drug safety. Both the FDA and the American Society of Clinical Oncology have recognized this issue and adopted positions requiring justification of criteria;16-18 these data identify where potentially unjustified criteria and limits exist for acute leukemia and a rational approach to meeting this goal while ensuring patient wellbeing. Judicious minimization of criteria based on drug safety profiles and standardization are key to enhancing research representativeness, efficacy, and safety.
Footnotes
- Received June 13, 2023
- Accepted August 4, 2023
Disclosures
AH has received personal fees (advisory boards) from AstraZeneca and AbbVie. MRL reports receiving personal fees (advisory boards) from Novartis and AbbVie. AAP has received institutional research funding from Kronos Bio, Pfizer, and Sumitomo, and personal fees from AbbVie (educational curriculum development) and Bristol Myers Squibb (advisory board). DJDeA has received research funding from AbbVie, Novartis, Blueprint, and GlycoMimetics, and personal fees from Amgen, Autolus, Blueprint, Gilead, Incyte, Jazz, Kite, Novartis, Pfizer, Servier, and Takeda (consulting). GAA has received personal fees (consulting) from Novartis. The other authors have no conflicts of interest to disclose.
Contributions
AH, CSL, GAA, IK and AAP are responsible for the study concept. AH, EQ, TPW and GAA designed the methodology. AH, EW and TPW collected the data. AH, EW, TPW and GAA analyzed the data. AH, EW, TPW and GAA wrote the manuscript. MRL, IK, AAP, DJDeA, CSL and GAA critically reviewed the manuscript. All authors revised the manuscript for publication.
Funding
AH is supported by grants from the National Cancer Institute of the National Institutes of Health (K08 CA273043), the American Society of Clinical Oncology (Career Development Award), the Alliance for Clinical Trials in Oncology (Special Projects Fund), and the Rieder Family Fellowship in Acute Lymphoblastic Lymphoma. IK is supported by a grant from the American Cancer Society (CSDG-21-088-01-ET). The sponsors had no role in gathering, analyzing, or interpreting the data. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US government.
Acknowledgments
The authors would like to acknowledge the work of Dillon Clancy in assisting with data collection and preparing the figures.
References
1. Emanuel EJ, Wendler D, Grady C. What makes clinical research ethical? JAMA. 2000; 283(20):2701-2711.
2. Riner AN, Girma S, Vudatha V. Eligibility criteria perpetuate disparities in enrollment and participation of black patients in pancreatic cancer clinical trials. J Clin Oncol. 2022; 40(20):2193-2202.
3. Liu R, Rizzo S, Whipple S. Evaluating eligibility criteria of oncology trials using real-world data and AI. Nature. 2021; 592(7855):629-633.
4. Duggal M, Sacks L, Vasisht KP. Eligibility criteria and clinical trials: an FDA perspective. Contemp Clin Trials. 2021; 109:106515.
5. Cancer clinical trial eligibility criteria: available therapy in non-curative settings. 2022.
6. Food and Drug Omnibus Reform Act of 2022.
7. National Library of Medicine. ClinicalTrials.gov. 2022.
8. Flick U. The SAGE handbook of qualitative data analysis. 2014.
9. Arachchige C, Prendergast LA, Staudte RG. Robust analogs to the coefficient of variation. J Appl Stat. 2022; 49(2):268-290.
10. Altman DG. Practical statistics for medical research. 3rd ed. 1999;611.
11. Fleiss JL, Levin B, Paik MC. Statistical methods for rates and proportions. Wiley series in probability and statistics. 3rd ed. J. Wiley. 2003;760.
12. Brown MB, Forsythe AB. Robust tests for the equality of variances. J Am Stat Assoc. 1974; 69(346):364-367.
13. Barrett NJ, Boehmer L, Schrag J. An assessment of the feasibility and utility of an ACCC-ASCO implicit bias training program to enhance racial and ethnic diversity in cancer clinical trials. JCO Oncol Pract. 2023; 19(4):e570-e580.
14. Montalban-Bravo G, Huang X, Naqvi K. A clinical trial for patients with acute myeloid leukemia or myelodysplastic syndromes not eligible for standard clinical trials. Leukemia. 2017; 31(2):318-324.
15. Pulte D, Fernandes L, Wei G. FDA analysis of ineligibility for acute myeloid leukemia clinical trials by race and ethnicity. Clin Lymphoma Myeloma Leuk. 2023; 23(6):463-470.
16. Food and Drug Administration Oncology Center of Excellence: Center for Drug Evaluation and Research and Center for Biologics Evaluation and Research. Cancer clinical trial eligibility criteria: patients with organ dysfunction or prior or concurrent malignancies; guidance for industry. 2020.
17. Food and Drug Administration Oncology Center of Excellence: Center for Drug Evaluation and Research and Center for Biologics Evaluation and Research. Cancer clinical trial eligibility criteria: patients with HIV, hepatitis B virus, or hepatitis C virus infections; guidance for industry. 2020.
18. Kim ES, Bruinooge SS, Roberts S. Broadening eligibility criteria to make clinical trials more representative: American Society of Clinical Oncology and Friends of Cancer Research Joint Research Statement. J Clin Oncol. 2017; 35(33):3737-3744.
Article Information
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.