Research Article Volume 11 Issue 2
Department of Biostatistics & Data Science, University of Kansas Medical Center, USA
Correspondence: Milind A. Phadnis, Department of Biostatistics & Data Science, University of Kansas Medical Center, Kansas City, USA
Received: June 16, 2022 | Published: June 27, 2022
Citation: Phadnis MA, Thewarapperuma N. %PT_GSDesign: A SAS Macro for group sequential designs with time-to-event endpoint using the concept of proportional time. Biom Biostat Int J. 2022;11(2):72-77. DOI: 10.15406/bbij.2022.11.00357
Sequential testing can be used to meet the specific needs of a clinical trial, all while adhering to the study's ethical, financial, and administrative considerations. When the assumption of proportional hazards or exponentially distributed lifetimes is not satisfied, the researcher can rely on the Proportional Time assumption for sample size calculation. The proportional time method has the advantage that previous study results can be used to bolster the current study design and provide an easier interpretation of the treatment benefit by reporting results as an improvement in longevity versus the more traditional interpretation of reduction in risk. This ease in interpretation of treatment benefit helps in raising interest in study participation. This novel method can be applied through a SAS macro and can be utilized for all distributions that belong to the generalized gamma family. The macro incorporates features specific to time-to-event data such as loss to follow-up, administrative censoring, differing accrual times and patterns, binding or non-binding futility rules with or without skips, and flexible alpha and beta spending functions. The macro includes validation for any parameters defined by the user, as well as suggestions for correcting erroneous input. This paper demonstrates two practical applications of the SAS macro with varying design inputs.
Keywords: efficacy, error spending, futility, proportional time, sample size, SAS software
Abbreviations: RCT, randomized clinical trial; NDA, new drug application; FDA, Food and Drug Administration; GSD, group sequential design; PH, proportional hazards; PT, proportional time; AFT, accelerated failure time; GG, generalized gamma
Two-arm randomized clinical trials (RCT) are considered the gold standard by biomedical researchers as they allow estimating how well a new treatment performs relative to a standard-of-care control. Treatments that are found to be promising in a Phase II trial are studied more comprehensively in a Phase III trial where, by enrolling a large number of patients (typically several hundred), researchers aim to investigate the effectiveness and safety of the new treatment against the current standard treatment. If such evidence is found in a Phase III trial, a new drug application (NDA) is submitted to the Food and Drug Administration (FDA) and on obtaining the FDA approval, the new drug becomes the new standard-of-care.
While traditional approaches require a fixed sample size to be calculated before conducting an RCT, based on the type I error, power, and clinically important treatment effect, in the medical setting they are limited by the fact that patients are accrued continually. Accrual can be a time-consuming process that depends on the accrual rate, the availability of qualified patients (based on inclusion/exclusion criteria), and the possibility of random dropouts, among many other factors. Thus, the primary outcome of interest is not available simultaneously for all patients, and researchers may wish to look at the outcomes of the early enrollees and use them as a basis for deciding whether the trial should be continued. This raises the concept of sequential testing in large Phase III trials, where interim results can be used to: (i) stop the trial early for overwhelming evidence of efficacy, (ii) stop the trial early for overwhelming evidence of futility, or (iii) continue the trial for lack of evidence of efficacy or futility.
A Group Sequential Design (GSD) formalizes the above concept by providing a solid statistical framework under which any of the above three decisions can be taken after looking at results collected at interim points in the study observation window. Ethical, financial, and administrative requirements often guide the statistical designs of GSDs.1–3 Such GSDs have been well developed for continuous and binary outcomes and have a long history, starting with quality control applications4 and progressing to the medical setting.5 A vast literature is available on this topic in many books6–10 and overview articles.11–13 When dealing with a time-to-event outcome, a repeated significance testing approach incorporating a family of designs14–16 can be combined with the error spending method17 to implement a GSD using a log-rank test or the proportional hazards (PH) assumption. Popular statistical software often implements GSDs for time-to-event outcomes using the weighted and unweighted versions of the log-rank test, either explicitly assuming exponentially distributed survival times or with the PH assumption, and can incorporate complexities of survival outcomes such as random dropouts, prespecified accrual and follow-up times, varying accrual patterns, equally or unequally spaced interim testing points (looks), efficacy-only designs, efficacy and futility designs, binding and non-binding futility rules, and many other flexible features specific to time-to-event outcomes.
When the underlying assumptions that drive the analytical and simulation-based approaches using the framework of the log-rank test are not valid, hardly any alternative methods are available in the literature or in standard statistical software. Recent developments in this field have considered relaxing the PH assumption in favor of a ‘proportionality of time’ (PT) assumption, leading to the development of GSDs in the context of an accelerated failure time (AFT) model.18 The authors have described, with the help of real-life examples, various scenarios in the biomedical setting where their approach could be advantageous compared to the standard methods. Their proposed GSD method provides an alternative approach when the PH assumption is not appropriate and allows various hazard shapes (increasing or decreasing monotonically over time, bathtub-shaped, arc-shaped) using the generalized gamma ratio distribution.19 The purpose of this paper is to present a fully functional SAS macro that can be used to implement their GSD method. The SAS macro incorporates a multitude of design features specific to a two-arm GSD for time-to-event outcomes, and the accompanying discussion of results provides information on how this macro can be implemented.
Statistical methods for GSD using the proportional time (PT) framework
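The statistical framework for the method proposed based on the PT assumption18 assumes that the survival times follow a generalized gamma (GG) distribution.20 The probability density function of the GG distribution is given as:
$$f_{GG}(t) = \frac{\beta}{\Gamma(k)\,\theta}\left(\frac{t}{\theta}\right)^{k\beta-1}\exp\left[-\left(\frac{t}{\theta}\right)^{\beta}\right], \quad t > 0 \tag{1}$$
where $\beta > 0$ and $k > 0$ are the shape parameters, $\theta > 0$ is the scale parameter and $\Gamma(k)$ is the gamma function defined as $\Gamma(k) = \int_0^\infty x^{k-1}e^{-x}\,dx$. For model fitting purposes, a re-parametrization is used to avoid convergence problems, using location parameter $\mu$, scale parameter $\sigma$ and shape parameter $\lambda$, that generalizes the two-parameter gamma distribution. The density function is given by:
$$f_{GG}(t) = \frac{|\lambda|}{\sigma t\,\Gamma(\lambda^{-2})}\left[\lambda^{-2}\left(e^{-\mu}t\right)^{\lambda/\sigma}\right]^{\lambda^{-2}}\exp\left[-\lambda^{-2}\left(e^{-\mu}t\right)^{\lambda/\sigma}\right] \tag{2}$$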
A complete taxonomy of the various hazard functions for the GG family is explained in the literature.21 Briefly, the GG family allows the flexibility of modeling different hazard shapes such as increasing from 0 to ∞ or from a constant to ∞, decreasing from ∞ to 0 or from ∞ to a constant, arc-shaped hazards, and bathtub-shaped hazards. Special cases of the GG family are (i) two-parameter gamma: $\lambda = \sigma$; (ii) standard gamma: $\mu = 0$, $\sigma = 1$ for fixed values of $\lambda$; (iii) Weibull: $\lambda = 1$; (iv) exponential: $\lambda = \sigma = 1$; (v) lognormal: $\lambda = 0$; (vi) inverse Weibull: $\lambda = -1$; (vii) inverse gamma: $\lambda = -\sigma$; (viii) ammag: $\lambda = 1/\sigma$; (ix) inverse ammag: $\lambda = -1/\sigma$.
Concept of proportional time (PT) as a special case of relative time (RT)
For a $GG(\mu, \sigma, \lambda)$ distribution, we have
$$\log\left[t_{GG(\mu,\sigma,\lambda)}(p)\right] = \mu + \sigma\, g_{\lambda}(p) \tag{3}$$
where $g_{\lambda}(p)$ is the logarithm of the pth quantile from the $GG(0,1,\lambda)$ distribution. The location parameter $\mu$ acts as a time-multiplier and governs the values of the median for fixed values of $\sigma$ and $\lambda$, resulting in the accelerated failure time (AFT) model. The scale parameter $\sigma$ determines the interquartile ratio for fixed values of $\lambda$ and independently of $\mu$. The shape parameter $\lambda$ determines the $GG(0,1,\lambda)$ distribution. Together, $\sigma$ and $\lambda$ describe the type of hazard function for the $GG(\mu,\sigma,\lambda)$ distribution.
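The re-parametrized density implies a simple way to simulate GG survival times, which is useful for seeing how $\mu$, $\sigma$ and $\lambda$ act. The following is a minimal Python sketch, not the macro's SAS code. It assumes $\lambda > 0$ and uses the fact that if $W$ follows a gamma distribution with shape $\lambda^{-2}$ and unit scale, then $T = e^{\mu}(\lambda^{2}W)^{\sigma/\lambda}$ follows $GG(\mu, \sigma, \lambda)$. The function names `sample_gg` and `mu_from_median_weibull` are hypothetical helpers introduced only for this illustration; the second one uses the quantile relation $\log t(p) = \mu + \sigma g_{\lambda}(p)$ with the closed form $g_{1}(0.5) = \log(\log 2)$ that holds when $\lambda = 1$.

```python
import math
import random
import statistics


def sample_gg(mu, sigma, lam, rng):
    """Draw one survival time from GG(mu, sigma, lam); assumes lam > 0."""
    k = lam ** -2                      # shape of the underlying gamma variable
    w = rng.gammavariate(k, 1.0)       # W ~ Gamma(shape=k, scale=1)
    return math.exp(mu) * (lam ** 2 * w) ** (sigma / lam)


def mu_from_median_weibull(med, sigma):
    """Location mu that gives median `med` when lam = 1 (Weibull case)."""
    # log(med) = mu + sigma * g_1(0.5), with g_1(0.5) = log(log 2)
    return math.log(med) - sigma * math.log(math.log(2.0))


rng = random.Random(1729)
# Exponential special case (lam = sigma = 1) with a median of 1 time unit
mu = mu_from_median_weibull(med=1.0, sigma=1.0)
times = [sample_gg(mu, 1.0, 1.0, rng) for _ in range(20000)]
empirical_median = statistics.median(times)
```

With these inputs the empirical median should sit close to the requested value of 1. For other values of $\lambda$ the quantile $g_{\lambda}(0.5)$ has no elementary closed form and would need a numerical gamma quantile routine.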
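The time by which p% of the population experience an event can lead to a statistic called ‘relative times RT(p),’ which can be used to compare survival profiles of patients in different treatment arms (new treatment versus standard treatment). Thus,
$$RT(p) = \frac{t_{1}(p)}{t_{0}(p)} = \frac{S_{1}^{-1}(1-p)}{S_{0}^{-1}(1-p)} \tag{4}$$
where $t_{i}(p)$ and $S_{i}$ denote the pth quantile and the survival function in arm $i$ ($i = 1$ for the new treatment, $i = 0$ for the standard treatment).
The interpretation of RT(p) is that the time required for p% of individuals in one study arm to experience an event is RT(p) times the time required for p% of individuals in the second study arm. Thus if $(\mu_{0}, \sigma_{0}, \lambda_{0})$ and $(\mu_{1}, \sigma_{1}, \lambda_{1})$ denote two different sets of GG parameter values, then
$$RT(p) = \exp\left\{(\mu_{1}-\mu_{0}) + \sigma_{1}\, g_{\lambda_{1}}(p) - \sigma_{0}\, g_{\lambda_{0}}(p)\right\} \tag{5}$$
The manner in which covariates affect RT(p) can be summarized as follows: if the two arms share the same $\sigma$ and $\lambda$ and differ only in $\mu$, equation (5) reduces to $RT(p) = \exp(\mu_{1}-\mu_{0})$ for every p, which is the PT case; if $\sigma$ or $\lambda$ also differ between the arms, RT(p) varies with p.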
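A small numerical check makes the distinction between constant and varying relative times concrete. The Python sketch below is illustrative only and is restricted to the Weibull special case ($\lambda = 1$ in both arms), where the standardized log-quantile has the closed form $g_{1}(p) = \log(-\log(1-p))$. The function names and the parameter values (a control-arm location of 3.0 and a time ratio of 1.4) are assumptions chosen for the illustration, not outputs of the macro.

```python
import math


def g_weibull(p):
    """Log of the p-th quantile of GG(0, 1, 1), i.e. the unit exponential."""
    return math.log(-math.log(1.0 - p))


def relative_time(p, mu0, sigma0, mu1, sigma1):
    """RT(p) for two Weibull arms (lambda = 1 in both arms)."""
    return math.exp((mu1 - mu0) + (sigma1 - sigma0) * g_weibull(p))


ps = [0.10, 0.25, 0.50, 0.75, 0.90]
mu0 = 3.0                              # hypothetical control-arm location
mu1 = mu0 + math.log(1.4)              # treatment arm shifted by a time ratio of 1.4

# Same sigma in both arms: RT(p) is the same at every quantile (proportional time)
pt_case = [relative_time(p, mu0, 0.75, mu1, 0.75) for p in ps]

# Different sigma: RT(p) changes with p, so the PT assumption does not hold
non_pt_case = [relative_time(p, mu0, 0.75, mu1, 1.00) for p in ps]
```

In the first case every entry equals 1.4, so the treatment benefit can be reported as a single multiplicative improvement in survival time; in the second case no single number summarizes the benefit.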
Test Statistic
Based on the discussion above, a test statistic that follows the four-parameter generalized gamma ratio (GGR) distribution can be developed.20 That is, the parameters of the GG distribution can be used to express RT(p) as:
(6)
Thus, for a new treatment to standard treatment allocation ratio r, we get a test statistic Q that follows the GGR distribution.
(7)
Although this test statistic can be used to calculate the sample size for a two-arm RCT in the case of a fixed study design (a design without any interim testing), when designing a more complex study incorporating all the desired features of a GSD, calculations become more complicated and have to be conducted using a simulation-based approach. The remainder of the paper discusses how the simulation-based GSD method of Phadnis et al.18 can be implemented for a two-arm phase III trial using a SAS macro.
A ten-step algorithm has been detailed in the GSD method of Phadnis et al.18 along with the appropriate formulas for performing sample size calculations. The proposed SAS macro, titled %PT_GSDesign, fully implements this algorithm and is written in base SAS and SAS/STAT.22 The various design features available in our macro are summarized below:
NumSimul: Number of simulated samples for the given sample size
alpha: Type I error
sides: 1-sided or 2-sided test
lambda: Shape parameter of the Control Arm using GG distribution
sigma: Scale parameter of the Control Arm using GG distribution
med: User entered Median of the Control Arm using GG distribution
evt_rate: Anticipated event rate for loss-to-follow-up (right censoring)
seed: Random seed for the simulations
r: Allocation ratio: (number in treatment arm)/(number in standard arm)
Delta_PT_Ha: Value of PT under the alternative hypothesis (greater than 1)
a: Accrual time for the study
a_type: Accrual pattern: "1" = Uniform, "2" = Truncated Exponential (parameter omega)
a_omega: Parameter omega of the truncated exponential accrual pattern (a_type = 2): >0 (convex) or <0 (concave); used only when a_type = 2
t: Total time for the study = Accrual time + Follow-up time
bind: Binding futility = 1; Non-binding futility=0
num_look: Total number of looks (including the look at the end-of-study)
look_points: equally spaced looks = 1, unequally spaced looks = 2
alpha_spend: Type of Alpha spending function: 1 = Jennison-Turnbull, 2 = Hwang-Shih-DeCani, 3 = User defined spending
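To illustrate how the accrual inputs (a, a_type, a_omega) and the total study time t interact, the following Python sketch generates entry times and applies administrative censoring at the end of the study. It is a sketch under stated assumptions rather than the macro's implementation: the truncated exponential entry density is taken to be proportional to $e^{-\omega x}$ on $[0, a]$, which is one common form, and whether the macro uses this exact form or sign convention for "convex" and "concave" is not confirmed here. The function names are hypothetical.

```python
import math
import random


def accrual_time(u, a, a_type=1, omega=1.0):
    """Map a uniform draw u in [0, 1] to a study entry time in [0, a]."""
    if a_type == 1:                    # uniform accrual
        return a * u
    # Assumed truncated exponential: density proportional to exp(-omega * x)
    scale = 1.0 - math.exp(-omega * a)
    return -math.log(1.0 - u * scale) / omega


def administrative_censoring(entry, event_time, t):
    """Return (observed follow-up, event indicator) when the study ends at t."""
    available = t - entry              # follow-up available to this subject
    if event_time <= available:
        return event_time, 1           # event observed before study end
    return available, 0                # censored at study end


rng = random.Random(1729)
entries = [accrual_time(rng.random(), a=12, a_type=2, omega=0.5) for _ in range(1000)]
```

Under this assumed density a positive omega concentrates entries early in the accrual window and a negative omega concentrates them late. A subject entering at time 10 whose event would occur 55 time units later is censored after 50 units of follow-up when t = 60.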
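The following datasets are needed to take advantage of the macro’s user-defined options: UserDefTime (look times, used when look_points=2), UserDefAlpha (cumulative alpha spent at each look, used when alpha_spend=3), and UserDefBeta (cumulative beta spent at each look, used when beta_spend=3).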
Default values for the macro parameters have been provided in the text description. Error checks in the code prevent a user from entering impossible values for the macro parameters. For example, where numerical input is required, character values cannot be entered. Likewise, numerical input outside the natural range of a macro parameter is not allowed. If such impossible values are entered, the macro will stop executing and display an error message in the log window suggesting corrections to the input values.
In addition to the above, the following extra features are provided in the macro:
We have also provided a “README.pdf” file detailing a step-by-step procedure to help users navigate through the process of entering input values. This, along with the full code, is available at https://github.com/thewan05/GSD_SAS_Macro.
These examples were first published in the methodology paper.18 They are presented once more so that the reader can easily reproduce them. Minor variations may occur depending on the seed used.
These macro parameters are used to obtain the results for example one:
NumSimul=10000, alpha=0.025, sides=1, lambda=0.5, sigma=0.75, med=20, evt_rate=0.7, seed=1729, r=1, Delta_PT_Ha=1.4, a=12, a_type=1, a_omega=1, t=60, bind=1, num_look=3, look_points=2, alpha_spend=1, rho=1, beta=0.10, beta_spend=1, rho_f=1, num_skip=0, maxiter=200, convg=1E3, direct=C:\Users\user1\Desktop
UserDefTime dataset: 3 24 36 60 (Table 1).
| Look no. | Look times | No. events – H0 control arm | No. events – H0 treatment arm | Alpha spent | Cumul. alpha spent | Upper significance boundary (efficacy) GGR test statistic | Stop probability under H0 | Cumul. stop probability under H0 | Cumul. subject time under H0 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 24 | 58.21 | 58.25 | 0.00833 | 0.00833 | 1.333 | 0.8053 | 0.8053 | 1992.86 |
| 2 | 36 | 87.74 | 87.74 | 0.00833 | 0.01667 | 1.259 | 0.1448 | 0.9501 | 2559.65 |
| 3 | 60 | 109.09 | 109.05 | 0.00833 | 0.025 | 1.219 | 0.0499 | 1 | 2940.79 |

| Look no. | Look times | No. events – HA control arm | No. events – HA treatment arm | Beta spent | Cumul. beta spent | Lower significance boundary (efficacy) GGR test statistic | Stop probability under HA | Cumul. stop probability under HA | Cumul. subject time under HA |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 24 | 58.21 | 40.03 | 0.03173 | 0.03173 | 1.105 | 0.6792 | 0.6792 | 2244.29 |
| 2 | 36 | 87.74 | 69.12 | 0.03173 | 0.06347 | 1.174 | 0.2136 | 0.8928 | 3093.34 |
| 3 | 60 | 109.09 | 99.41 | 0.03173 | 0.0952 | 1.219 | 0.1072 | 1 | 3875.05 |
Table 1 GSD - Ovarian CT using proposed method with 10,000 simulations; Pocock plans (efficacy and futility at all looks).
These macro parameters are used to obtain the results for example two:
NumSimul=10000, alpha=0.025, sides=1, lambda=0.5, sigma=0.75, med=20, evt_rate=0.7, seed=1729, r=1, Delta_PT_Ha=1.4, a=12, a_type=1, a_omega=1, t=60, bind=1, num_look=3, look_points=2, alpha_spend=1, rho=3, beta=0.10, beta_spend=1, rho_f=3, num_skip=0, maxiter=200, convg=1E3, direct=C:\Users\user1\Desktop
UserDefTime dataset: 3 24 36 60 (Table 2).
| Look no. | Look times | No. events – H0 control arm | No. events – H0 treatment arm | Alpha spent | Cumul. alpha spent | Upper significance boundary (efficacy) GGR test statistic | Stop probability under H0 | Cumul. stop probability under H0 | Cumul. subject time under H0 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 24 | 52.57 | 52.63 | 0.00093 | 0.00093 | 1.457 | 0.4309 | 0.4309 | 1797.58 |
| 2 | 36 | 79.19 | 79.21 | 0.00648 | 0.00741 | 1.312 | 0.4378 | 0.8687 | 2308.8 |
| 3 | 60 | 98.42 | 98.4 | 0.01759 | 0.025 | 1.222 | 0.1313 | 1 | 2652.04 |

| Look no. | Look times | No. events – HA control arm | No. events – HA treatment arm | Beta spent | Cumul. beta spent | Lower significance boundary (efficacy) GGR test statistic | Stop probability under HA | Cumul. stop probability under HA | Cumul. subject time under HA |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 24 | 52.57 | 36.08 | 0.00352 | 0.00352 | 0.978 | 0.3887 | 0.3887 | 2026.69 |
| 2 | 36 | 79.19 | 62.34 | 0.02463 | 0.02815 | 1.127 | 0.356 | 0.7447 | 2793.83 |
| 3 | 60 | 98.42 | 89.7 | 0.06685 | 0.095 | 1.222 | 0.2557 | 1.0004 | 3501.2 |
Table 2 GSD - Ovarian CT using proposed method with 10,000 simulations; O’Brien-Fleming plan (efficacy and futility at all looks)
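The alpha-spent columns of Tables 1 and 2 are consistent with a power-family spending function of the form $\alpha(\tau) = \alpha\,\tau^{\rho}$, evaluated at information fractions equal to the look number divided by the total number of looks. The Python sketch below reproduces those columns under that reading. It is an illustration, not the macro's SAS code, and the choice of information fraction is an inference from the reported values rather than something stated in the macro documentation; the function names are hypothetical.

```python
def power_spending(total, rho, fractions):
    """Cumulative error spent at each information fraction: total * f ** rho."""
    return [total * f ** rho for f in fractions]


def increments(cumulative):
    """Error spent at each look, from the cumulative amounts."""
    spent, previous = [], 0.0
    for c in cumulative:
        spent.append(c - previous)
        previous = c
    return spent


# Assumed information fractions: look index / number of looks
fractions = [1 / 3, 2 / 3, 1.0]

pocock_like = power_spending(0.025, 1, fractions)   # rho = 1, as in example one
obf_like = power_spending(0.025, 3, fractions)      # rho = 3, as in example two
obf_like_spent = increments(obf_like)
```

With rho=1 the type I error is spent evenly across looks, giving a Pocock-type plan, while rho=3 spends very little at the first look and most at the final look, giving an O'Brien-Fleming-type plan.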
These macro parameters are used to obtain the results for example three:
NumSimul=10000, alpha=0.025, sides=1, lambda=0.5, sigma=0.75, med=20, evt_rate=0.7, seed=1729, r=1, Delta_PT_Ha=1.4, a=12, a_type=1, a_omega=1, t=60, bind=1, num_look=3, look_points=2, alpha_spend=3, rho=3, beta=0.10, beta_spend=3, rho_f=3, num_skip=0, maxiter=200, convg=1E3, direct=C:\Users\user1\Desktop
UserDefTime dataset: 3 24 36 60
UserDefAlpha dataset: 0.0050 0.0125 0.0250
UserDefBeta dataset: 0.0100 0.0350 0.1000 (Table 3).
| Look no. | Look times | No. events – H0 control arm | No. events – H0 treatment arm | Alpha spent | Cumul. alpha spent | Upper significance boundary (efficacy) GGR test statistic | Stop probability under H0 | Cumul. stop probability under H0 | Cumul. subject time under H0 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 24 | 54.32 | 54.32 | 0.005 | 0.005 | 1.384 | 0.5758 | 0.5758 | 1859.49 |
| 2 | 36 | 81.82 | 81.83 | 0.0075 | 0.0125 | 1.279 | 0.3191 | 0.8949 | 2388.16 |
| 3 | 60 | 101.73 | 101.73 | 0.0125 | 0.025 | 1.224 | 0.1043 | 0.9992 | 2743.29 |

| Look no. | Look times | No. events – HA control arm | No. events – HA treatment arm | Beta spent | Cumul. beta spent | Lower significance boundary (efficacy) GGR test statistic | Stop probability under HA | Cumul. stop probability under HA | Cumul. subject time under HA |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 24 | 54.32 | 37.35 | 0.00922 | 0.00922 | 1.022 | 0.5416 | 0.5416 | 2094.91 |
| 2 | 36 | 81.82 | 64.53 | 0.02305 | 0.03227 | 1.135 | 0.2776 | 0.8192 | 2887.1 |
| 3 | 60 | 101.73 | 92.82 | 0.05933 | 0.0922 | 1.224 | 0.1798 | 0.999 | 3615.75 |
Table 3 GSD - Ovarian CT using proposed method with 10,000 simulations; user-defined alpha and beta spending (efficacy and futility at all looks)
These macro parameters are used to obtain the results for example four:
NumSimul=10000, alpha=0.025, sides=1, lambda=1, sigma=1, med=1, evt_rate=1, seed=1729, r=1, Delta_PT_Ha=1.75, a=1, a_type=1, a_omega=1, t=4, bind=1, num_look=4, look_points=1, alpha_spend=2, rho=1, beta=0.20, beta_spend=2, rho_f=1, num_skip=2, maxiter=200, convg=1E3, direct=C:\Users\user1\Desktop (Table 4).
| Look no. | Look times | No. events – H0 control arm | No. events – H0 treatment arm | Alpha spent | Cumul. alpha spent | Upper significance boundary (efficacy) GGR test statistic | Stop probability under H0 | Cumul. stop probability under H0 | Cumul. subject time under H0 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 22.3 | 22.3 | 0.00875 | 0.00875 | 2.218 | 0.0088 | 0.0088 | 32.156 |
| 2 | 2 | 51.23 | 51.17 | 0.00681 | 0.01556 | 1.632 | <0.0001 | 0.0088 | 73.726 |
| 3 | 3 | 65.58 | 65.61 | 0.00531 | 0.02087 | 1.522 | 0.9837 | 0.9922 | 94.496 |
| 4 | 4 | 72.79 | 72.84 | 0.00413 | 0.025 | 1.452 | 0.0077 | 0.9999 | 104.891 |

| Look no. | Look times | No. events – HA control arm | No. events – HA treatment arm | Beta spent | Cumul. beta spent | Lower significance boundary (efficacy) GGR test statistic | Stop probability under HA | Cumul. stop probability under HA | Cumul. subject time under HA |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 22.3 | 13.93 | 0 | 0 | - | 0.2674 | 0.2674 | 35.198 |
| 2 | 2 | 51.23 | 35.59 | 0 | 0 | - | 0.3524 | 0.6198 | 89.716 |
| 3 | 3 | 65.58 | 50.14 | 0.16268 | 0.16268 | 1.45 | 0.3082 | 0.928 | 126.351 |
| 4 | 4 | 72.79 | 59.94 | 0.03222 | 0.1949 | 1.453 | 0.0722 | 1.0002 | 150.989 |
Table 4 GSD output for exponentially distributed data using proposed method with 10,000 simulations (two futility skips)
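The spending columns of Table 4 can be checked against the Hwang-Shih-DeCani function $\alpha(\tau) = \alpha\,(1 - e^{-\gamma\tau})/(1 - e^{-\gamma})$ with $\gamma = 1$ at four equally spaced looks. The Python sketch below is an illustration under assumptions, not the macro's SAS code: it treats the macro inputs rho and rho_f as the parameter $\gamma$, and it handles futility skips by spending nothing at skipped looks and spending the accumulated amount at the first non-skipped look. Both readings are inferred from the reported values. The total beta of 0.1949 is the cumulative value reported in the table rather than the nominal 0.20, and the function names are hypothetical.

```python
import math


def hsd_cumulative(total, gamma, fractions):
    """Hwang-Shih-DeCani cumulative spending; gamma must be nonzero."""
    denom = 1.0 - math.exp(-gamma)
    return [total * (1.0 - math.exp(-gamma * f)) / denom for f in fractions]


def spend_with_skips(cumulative, num_skip):
    """Per-look spending when the first num_skip looks spend nothing."""
    spent, previous = [], 0.0
    for i, c in enumerate(cumulative):
        if i < num_skip:
            spent.append(0.0)          # skipped look: no error spent
        else:
            spent.append(c - previous) # catch up to the cumulative target
            previous = c
    return spent


fractions = [0.25, 0.50, 0.75, 1.00]   # four equally spaced looks

alpha_cum = hsd_cumulative(0.025, 1.0, fractions)
alpha_spent = spend_with_skips(alpha_cum, num_skip=0)

beta_cum = hsd_cumulative(0.1949, 1.0, fractions)
beta_spent = spend_with_skips(beta_cum, num_skip=2)
```

Under these assumptions the computed alpha and beta increments agree, to five decimal places, with the "Alpha spent" and "Beta spent" columns of Table 4.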
A GSD is generally implemented as a large-sample Phase III trial and therefore provides an opportunity to incorporate information obtained from a preceding moderate-sized Phase II study. In our paper, we have built a SAS macro that implements a GSD incorporating various design features specific to time-to-event outcomes in the case of non-proportional hazards. While earlier methods using the nonparametric log-rank test or the PH assumption are available in standard statistical software, our macro is the first of its kind in implementing a GSD in the non-PH case using a three-parameter GG distribution. The macro fully executes the method based on the PT assumption18 and thereby offers researchers an additional option in designing Phase III trials for the non-PH case. The macro has several advantages: it handles different types of hazard shapes; it utilizes Phase II data to ensure that early interims are not conducted with too few events; it is simulation based and does not depend on any asymptotic normality of the test statistic; and, most importantly, it provides clinically meaningful and easy-to-interpret efficacy and/or futility boundaries based on the concept of improvement in longevity. Due to this direct interpretation of "treatment effect" as an improvement in survival time, we hope that researchers working in this area will find our SAS macro to be of practical value in implementing a GSD for Phase III time-to-event trials.
The high-performance computing capabilities, which were used to conduct some of the analyses described in this paper, were supported in part by the National Cancer Institute (NCI) Cancer Center Support Grant P30 CA168524; the Kansas IDeA Network of Biomedical Research Excellence Bioinformatics Core, supported by the National Institute of General Medical Sciences award P20 GM103418; and the Kansas Institute for Precision Medicine COBRE, supported by the National Institute of General Medical Sciences award P20 GM130423.
The authors declare no conflicts of interest.
©2022 Phadnis, et al. This is an open access article distributed under the terms of the, which permits unrestricted use, distribution, and build upon your work non-commercially.