Some notable properties of the standard oncology phase I design
We identify three properties of the standard oncology phase I trial design or 3 + 3 design. We show that the standard design implicitly uses isotonic regression to estimate a maximum tolerated dose. We next illustrate the relationship between the sta…
Authors: Gregory J. Hather (University of California, Berkeley, Department of Statistics); Howard Mackey (Genentech Inc.)
Gregory J. Hather, Department of Statistics, University of California, Berkeley, California, USA
Howard Mackey, Genentech Inc., South San Francisco, California, USA
October 22, 2018

Abstract

We identify three properties of the standard oncology phase I trial design or 3 + 3 design. We show that the standard design implicitly uses isotonic regression to estimate a maximum tolerated dose. We next illustrate the relationship between the standard design and a Bayesian design proposed by Ji et al. (2007). A slight modification to this Bayesian design, under a particular model specification, would assign treatments in a manner identical to the standard design. We finally present calculations revealing the behavior of the standard design in a worst case scenario and compare its behavior with other 3 + 3-like designs.

1 Introduction

The main goal of an oncology phase Ia clinical trial is to assess the safety of a drug which has not yet been tested in humans (Arbuck, 1996). A well designed oncology phase I trial should yield enough information to determine a safe dose, or range of safe doses, to use in further trials, while maintaining a reasonably low level of risk to the patients in the study. The dose or doses to be used for further study should be low enough to be safe for most patients, but high enough to be efficacious, since higher doses are often more effective. While the ‘more is better’ paradigm is well accepted in the case of chemotherapeutic agents, it is not clear that this paradigm should hold for newer, targeted, non-chemotherapeutic cancer agents. That is not to say that many or most of the newer molecularly targeted agents are not more efficacious in higher doses, but that it is not a given that the ‘more is better’ paradigm always holds.
The primary scientific question in an oncology phase Ia study dictates to some extent the type of patient who would enroll in such a study. Patients who have standard of care treatment available to them are less likely to participate in a phase I oncology study, and as a result, the patient population is often a heterogeneous group of patients with different types of late stage cancer. Different types of cancer often suggest different risk-benefit tolerances by both the patient and the treating clinician. This may result in a difficulty in selecting a trial design due to the existence and validity of multiple risk-benefit ratios. For example, a patient with metastatic pancreatic cancer may be willing to risk more toxicity than a patient with newly diagnosed late stage prostate cancer. This may result in the uncomfortable situation where the same adverse reaction has different implications for future development depending on the type of disease the patient has. In general, phase I protocols report so-called dose-limiting toxicities (DLTs), irrespective of the type of disease the patient has. Additionally, the developers of a new oncology therapy may not know, before the first human data are generated in phase Ia, all the types of cancer to target in future phase II and III studies. This may be due to unexpected, dramatic results in phase Ia and/or changes in the financial resources of the entity developing the therapy. Another complicating reality is that assigning attribution of patient outcomes to the therapy under investigation, a patient’s cancer, or a patient’s concomitant medication is not an exact science. This is especially true when therapeutic agents are being tested for the first time in humans. Early on in testing, it is not uncommon for a patient’s outcome to be deemed a DLT only after a number of other patients have experienced the same outcome and/or degree of severity.
An outcome determined to be a DLT after patients have started treatment in higher dose levels may call for a protocol-specified action that differs from the action already taken before the DLT was identified. These realities are rarely discussed in the literature for consideration of a phase I design, which further complicates the mission of designing the ‘best’ phase Ia trial. Unlike the phase II or III paradigm, where the objective is to assess efficacy while obtaining valuable safety information, phase I studies often necessitate the administration of unsafe amounts of drug to some patients in order to determine a maximum tolerable dose, or range of safe doses. And as in all clinical studies, the trial should not involve too many patients or take too long to complete. One can find many methods for dose finding in the literature, the “standard design” (Storer, 1989) being the oldest of the commonly employed phase I designs. Other designs contained in the literature include continual reassessment (O’Quigley et al., 1990), random walk (Durham and Flournoy, 1994), escalation with overdose control (Babb et al., 1998), and cumulative cohort designs (Ivanova et al., 2007). For a more extensive review of various oncology phase I trial designs, the reader is referred to Rosenberger and Haines (2002), Potter (2006), and Koyfman et al. (2007). It should be noted that the standard design remains by far the most commonly used in the phase Ia oncology setting. This is likely due to the ease with which clinical centers are able to carry out the design, its simplicity, historical performance, and freedom from model assumptions. We have observed in the literature that comparisons made with newer competing designs, almost always using simulations, are usually performed with assumptions that do not reflect some common uses of the 3 + 3 design.
One of these assumptions has to do with dose levels being fixed a priori at the beginning of the study. For phase Ia studies especially, clinical investigators are usually not comfortable setting dose escalations before any human data are available. They prefer to fix the first dose level only and to have future dose escalations be determined by the accruing toxicity observed in the phase Ia. This is rightly due to the fact that pre-clinical toxicity information can perform poorly at predicting human toxicities, especially for newer targeted small molecule therapeutics. This enhancement to the 3 + 3 was introduced by Simon et al. (1997) as accelerated titration; however, its original incarnation referred to intrapatient escalations. We mention this because we believe that the 3 + 3, while possessing properties which would cause a formally trained statistician some discomfort (lack of an estimand, to name one), has served drug development relatively well. We make this comment in the context of new drugs that have yet to be tested in man, the phase Ia setting. We believe that progress in drug development methods will be surer as the professed merits and deficiencies of the 3 + 3 are more fairly assessed by accounting for more current clinical practice. Recently, Ji et al. (2007) proposed a design similar to the cumulative cohort design, which uses a Bayesian model to describe the rate of toxicities at each dose. In this design, decisions about future doses are based on the posterior distributions of toxicity rates at current doses. These posterior distributions are a function only of the number of patients treated and the number of patients with toxicities at the current dose. Ji et al.
noted that the decision rules for their design could be displayed in a monitoring table, where the columns correspond to the number of patients treated at the current dose level, and the rows correspond to the number of DLTs at the current dose level. Their work suggested new ways of thinking about phase I trial designs which inspired our investigation of the relationship between the standard design and their Bayes design. In the following sections, we will show that the standard design implicitly uses isotonic regression to estimate a maximum tolerated dose, is related to a special case of the Ji, Li, Bekele design, and possesses an analytical expression for the upper bound on the probability of selecting an unsafe final dose.

2 Background

2.1 Terms

In the protocol of an oncology phase Ia study, a DLT will be defined as any of the pre-specified ‘dose limiting’ adverse events that have been determined by the investigator to be related, or possibly related, to the drug and which occur within a specified temporal window. It should be noted that regulatory agencies are conservative with respect to assessing the safety of new therapeutic agents. As a result, any event occurring during the course of the DLT window that cannot be ruled out as being related to the agent is considered a DLT. These pre-specified ‘dose limiting’ adverse events are defined as adverse reactions considered to be severe enough to limit a patient from further exposure to the experimental treatment. These definitions vary from study to study and are based upon the judgment of the authors of the protocol. A maximum tolerated dose (MTD) is the highest dose that is considered safe in patients. The standard design contains a heuristic, and the MTD is defined as the dose that is yielded by applying the heuristic. While the standard design yields an estimator for the MTD, an explicit estimand is not defined.
Most other designs, including the Ji, Li, Bekele method, do define an estimand as well as its estimator for the MTD, the estimand being the dose at which a pre-specified fraction of patients experience a DLT in a well defined patient population. In this article, we use the term “cohort” to refer to a group of patients treated concurrently and “dose group” to refer to all the patients who receive a particular dose of drug.

2.2 The standard design

The standard design, also commonly referred to as the 3 + 3 design, dictates either fully specified or partially specified dose level increases. A fully specified escalation scheme sets a starting dose level and the successive increases for future dose levels. For example, dose level increases could either double the previous level or follow a modified Fibonacci sequence (Omura, 2003). A partially specified escalation scheme also specifies a starting dose level; however, future dose levels sequentially depend on the number and nature of toxicities experienced and/or pharmacokinetic or pharmacodynamic measures. In this article, we consider the fully specified escalation scheme with a pre-specified number of dose levels. The standard design is defined as having cohort sizes of 3 patients, where no dose level has more than 2 cohorts. This design begins treating patients at the lowest dose level. If none of the 3 patients in the first cohort experience a DLT, the next cohort of patients gets treated at the next higher dose level. If 1 of the 3 patients experiences a DLT, the design assigns another cohort at the same dose level. If, for some dose level, 2 or more (of 3 or 6) patients experience DLTs, then that dose level is considered unsafe and is not revisited again during the study, and the next cohort must be treated at the next lower dose.
The trial ends when at most 1 patient in 6 experiences a DLT at some dose, and 2 or more (of 3 or 6) patients experience DLTs at the next higher dose. The highest dose at which at most 1 of 6 patients experiences a DLT is determined to be the MTD. There is no standard approach for addressing the scenario where 2 or more patients experience DLTs at the lowest dose level, but two common approaches are to treat the next cohort at some reasonable fraction of the starting dose or to simply close down the study. Table 1 contains these rules of the standard design based upon the number of DLTs. While this design enjoys much popularity among investigators studying new cancer therapies, it does suffer from the previously mentioned defect that a well defined target, the estimand, is not known. In general, clinical investigators believe that the standard design determines the dose which would cause between 17% (1/6) and 33% (2/6) of patients to experience a DLT. This belief is not completely misguided if one only considers the information contained in patients treated at the determined MTD to be relevant. However, a more sophisticated view would consider information contained at all tested dose levels. Under this perspective, the dose level that is estimated by the design is completely dependent on the true unknown DLT / dose relationship. For example, the standard design may yield an unbiased estimator of the dose which causes 20% of patients to experience a DLT for one DLT / dose relationship versus 30% under a different DLT / dose relationship. It has been reported in the literature that the standard design had targeted doses corresponding to a 23% to 28% DLT rate in 22 phase I studies (Smith et al., 1996). In addition, Lin and Shih (2001) reported that the standard design targeted DLT rates of 19 to 29% based on 3 distinctly different dose / DLT relationships.
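The dose-assignment rules of the standard design described above depend only on the number of patients treated and the number of DLTs seen in the current dose group, so they can be written as a small lookup function. The following is a minimal Python sketch under that reading of the prose rules. The function name `three_plus_three_action`, the action labels, and the `next_dose_unsafe` flag are illustrative assumptions and not part of the original design description. The sketch deliberately leaves out boundary cases that the rules above do not settle, such as an unsafe lowest dose or running out of pre-specified dose levels.

```python
def three_plus_three_action(n_treated, n_dlt, next_dose_unsafe=False):
    """Return the next action for the current dose group under the 3+3 rules.

    n_treated        -- patients treated so far at the current dose (3 or 6)
    n_dlt            -- DLTs observed so far at the current dose
    next_dose_unsafe -- hypothetical flag: True if the next higher dose has
                        already been declared unsafe (2 or more DLTs)
    """
    if n_treated not in (3, 6):
        raise ValueError("a 3+3 dose group holds 3 or 6 patients")
    if not 0 <= n_dlt <= n_treated:
        raise ValueError("n_dlt must lie between 0 and n_treated")

    # Two or more DLTs (of 3 or 6): the dose is unsafe, move down one level.
    if n_dlt >= 2:
        return "de-escalate"

    if n_treated == 3:
        # One DLT in three, or a return to a dose below an unsafe one:
        # treat a second cohort of three at the same dose.
        if n_dlt == 1 or next_dose_unsafe:
            return "expand"
        return "escalate"

    # Six patients with at most one DLT.
    if next_dose_unsafe:
        return "stop"  # the current dose is declared the MTD
    return "escalate"
```

Calling `three_plus_three_action(3, 1)` returns `"expand"`, and `three_plus_three_action(6, 1, next_dose_unsafe=True)` returns `"stop"`, which corresponds to the stopping condition stated in the text.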
Storer (2001) reports that the standard design is an appropriate design when interest is in identifying a dose which corresponds to a 20 to 25% DLT rate. It should be noted that while the standard design requires only 6 patients treated at a particular dose to determine the MTD, arguably too small a number to gain a reasonable understanding of an agent’s toxic properties, common practice has become to treat an extra 12 to 18 patients at the MTD in order to gain more experience before designing a phase II clinical trial. To the authors’ knowledge, how this practice affects the properties of this augmented standard design has not been examined in the literature, and thus we will not discuss this case any further in this article.

2.3 The Ji, Li, Bekele method

The Ji, Li, Bekele method uses a fully specified dose escalation scheme, and like the standard design, starts at the lowest dose and escalates dose levels by at most one dose level at a time. The decision for whether to escalate, stay the same, or de-escalate is based on the number of patients and the number of DLTs at the current dose. A toxicity exclusion rule prevents dose escalation if the next higher dose is estimated to be unacceptably toxic based on patient outcomes at that dose. Cohorts of any size can be specified and the stopping rule is based on a fixed pre-specified sample size, with the one exception of early stopping when the starting dose is found unacceptably toxic. Before we describe the method in detail, we would like to introduce some notation. First, we will use the integer $i$ to label the dose levels. We define $p_i$ to be the true rate of DLTs at dose level $i$. We let $p$ denote a vector containing the values $p_i$ for all the dose levels. Also, the Ji, Li, Bekele method requires a user-specified targeted ratio of DLTs which we shall denote by $p_T$.
Using data that becomes available during the course of the study and a Bayesian framework with an assumed prior on $p$, Ji, Li, and Bekele propose calculating a probability density for the probability $p_i$ of seeing a DLT at the current dose $i$. The unit interval can be divided into $\{(0,\, p_T - K_1\sigma_i),\ [p_T - K_1\sigma_i,\, p_T + K_2\sigma_i],\ (p_T + K_2\sigma_i,\, 1)\}$, where $\sigma_i$ is the standard deviation of the posterior density of $p_i$, and $K_1$ and $K_2$ are parameters chosen by the researcher. If the first interval has the most posterior probability mass, then the dose is increased. If the second interval has the most posterior probability mass, then the dose is kept the same. If the third interval has the most posterior probability mass, then the dose is de-escalated. To better control the risk of treating patients at unsafe levels, Ji et al. included a toxicity exclusion rule, where one computes $P(p_i > p_T)$ conditional on the observed data. If this probability is greater than some user defined quantity $\xi$, then the dose level $i$ and all higher doses are considered to be unacceptably toxic and those levels will not be visited again. If the current dose is one level below an excluded dose, then the action is based on which of the first two intervals has the most posterior probability mass. The trial ends when the total number of patients used in the study reaches a pre-specified number or if the starting dose is found unacceptably toxic. At the end of the trial, the data and the prior are used to estimate the DLT rate $E(p_i) = \hat{p}_i$ at each dose level not ruled out by the toxicity exclusion rule. Next, isotonic regression is performed on the vector of estimated DLT rates, $\hat{p}$. The isotonic regression procedure uses the weight $\sigma_i^2$ at each dose level $i$ and produces a new estimate $p^*$ for the DLT rates that is a monotonically increasing sequence.
The estimated MTD is the dose $i$ at which $|p^*_i - p_T|$ is minimized. If some doses tie for the smallest difference, and if the mean of $p^*_i$ among the ties is less than $p_T$, then the highest dose among the ties is selected. Otherwise, the lowest dose among the ties is selected.

3 Isotonic regression and the standard design

The isotonic regression estimator for the MTD has been proposed by Leung and Wang (2001), Stylianou and Flournoy (2002), Ivanova et al. (2003), and Ji et al. (2007). In simulations by Ivanova et al., an isotonic regression estimator outperformed both an empirical estimator and a parametric maximum likelihood estimator for the MTD. We will now show that the standard design’s method for estimating the MTD is equivalent to an estimator involving isotonic regression. In general, isotonic regression is a nonparametric regression technique that yields a fit to a vector which minimizes the residual sum of squares, subject to the constraint that the fitted values constitute a monotonically increasing (or decreasing) sequence (Robertson et al., 1988). In other words, if $y$ is a vector to be fit by isotonic regression, then the result would be a vector $z^* = \arg\min_z \lVert y - z \rVert^2$ subject to the constraint that $z_i \le z_j$ for all $i < j$. One basic property of isotonic regression is that if there is some $i$ for which $y_j \ge y_i$ for all $j$ greater than $i$ and $y_j \le y_i$ for all $j$ less than $i$, then the fit at $i$ is $z^*_i = y_i$. Isotonic regression is an appealing tool in the phase I trials setting because the assumption of toxicity increasing with dose is usually quite reasonable. After ending a phase I trial, one can apply isotonic regression to the vector $\hat{p} = (t_1/n_1,\, t_2/n_2,\, \ldots,\, t_D/n_D)$ using the weights $(n_1, n_2, \ldots, n_D)$, where $t_i$ is the number of DLTs seen at dose level $i$, $n_i$ is the number of patients treated at dose level $i$, and $D$ is the total number of dose levels considered.
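The weighted isotonic fit and the closest-to-target selection rule described above can be illustrated with a short, self-contained Python sketch. It uses the pool-adjacent-violators algorithm, which is one standard way to compute a weighted isotonic regression; the text does not say which algorithm the cited authors used, so that choice is an assumption. The function names `isotonic_fit` and `closest_dose_to_target`, the tolerance `tol`, and the example counts are hypothetical, and the sketch assumes strictly positive weights (only doses at which patients were actually treated).

```python
def isotonic_fit(values, weights):
    """Weighted isotonic (non-decreasing) least-squares fit via
    pool-adjacent-violators. Assumes every weight is positive."""
    blocks = []  # each block: [weighted mean, total weight, number of points]
    for y, w in zip(values, weights):
        blocks.append([float(y), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def closest_dose_to_target(fit, target, tol=1e-12):
    """Index (0-based) of the dose whose fitted rate is closest to target.

    Ties are broken as in the text: the highest tied dose if the mean of the
    tied fitted rates is below the target, otherwise the lowest tied dose.
    """
    diffs = [abs(p - target) for p in fit]
    best = min(diffs)
    ties = [i for i, d in enumerate(diffs) if d <= best + tol]
    mean_tied = sum(fit[i] for i in ties) / len(ties)
    return ties[-1] if mean_tied < target else ties[0]


# Hypothetical end-of-trial counts for three dose levels.
dlts = [1, 0, 2]        # t_i
patients = [3, 3, 6]    # n_i
rates = [t / n for t, n in zip(dlts, patients)]
fitted = isotonic_fit(rates, patients)
```

In this example the first two observed rates (1/3 and 0) violate monotonicity, so they are pooled into a common fitted value of 1/6, while the third dose keeps its observed rate of 1/3.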
These values and weights for isotonic regression have been used by other researchers to estimate the dose toxicity curve (Stylianou and Flournoy, 2002), in which case the isotonic regression estimator is equivalent to the order restricted nonparametric maximum likelihood estimator (Sun, 1998). After fitting the isotonic regression to $\hat{p}$, there are several ways to estimate the MTD. Here, we consider the approach used by Stylianou and Flournoy (2002), where the MTD estimate is the largest dose with an estimated toxicity not exceeding a preset level. Recall that when a trial using the standard design ends, the estimated MTD is the highest dose for which 6 people were treated and no more than 1 person experienced a DLT. We shall call this particular dose level $d$. By the nature of the standard design rules, $t_i/n_i \le 1/6$ for $i \le d$ and $t_i/n_i \ge 1/3$ for $i > d$. It now follows that the isotonic regression estimate of the toxicity rate will be no more than $1/6$ for $i \le d$ and at least $1/3$ for $i > d$. Thus, if we set the target level $p_T$ to any value within the range $[1/6, 1/3)$, the dose level $d$ will be the largest dose with an estimated toxicity not exceeding $p_T$. Thus, for $1/6 \le p_T < 1/3$, this isotonic regression estimator of the MTD is equivalent to the MTD as defined by the standard design.

4 Dose assignment for the Ji, Li, Bekele method and the standard design

Ji, Li and Bekele note that when the priors used in their method are identical and independent among dose levels, then the action to be taken with respect to treating future patients depends on the cumulative number of patients and the cumulative number of DLTs in the current dose group. As such, the behavior of the design can be described by a trial monitoring table. In Ji et al.
(2007), the trial monitoring table has columns corresponding to the number of patients treated at the current dose level and rows corresponding to the number of DLTs at the current dose level. These elements completely specify future action in all cases where the next higher dose level has not been found to have unacceptable toxicity. With the standard design, future action also depends on the number of patients and the number of DLTs seen in the current dose group. Likewise, the standard design rules in Table 1 can also be described by a trial monitoring table, as in Table 2. We found that when the Ji, Li, Bekele method is used with cohorts of size 3 and, for example, parameters $K_1 = 1$, $K_2 = 0.1$, $p_T = 0.17$, $\xi = 0.7$, and prior $B(0.005, 0.005)$, it yields a monitoring table identical to Table 2. A range of parameter values close to the ones above also produced the same monitoring table. Thus, the actions during the trial are identical for both designs when the maximum sample size in the Ji, Li, Bekele method is replaced by a stopping rule of at most 1 DLT out of 6 patients with the next higher dose found to have unacceptable toxicity. That is, patients enrolled in either study would be assigned treatment in exactly the same way. In addition, while the MTD estimates of the two designs differ in general, they are identical for our choice of parameters. This is because all the doses not excluded would have $\hat{p}_i$ less than or equal to $1/6$. Thus, all elements of $p^*$ would be less than or equal to $1/6$, which is less than $p_T = 0.17$. Therefore, the highest non-excluded dose would be selected, which would be the same dose selected by the standard design. We would like to suggest that a modification to the fixed stopping rule in the Ji, Li, Bekele method would not alter its spirit. Moreover, we expect that a more data driven stopping rule would be viewed as an enhancement by practitioners.
As an example, if a dose was found to have unacceptable toxicity early under their design, it may dictate treating a large number of patients at the next lower dose level, regardless of how much information has been accumulated at that level. A data driven stopping rule might mitigate such an issue.

5 The standard design and its potential to select an unsafe MTD

There are many ways to assess how a clinical trial design would perform in terms of patient safety. Here, we consider the standard design in terms of the underlying toxicity rate at the estimated MTD. While research exists with respect to this aspect in the form of potentially useful expressions (Reiner et al., 1999; Lin and Shih, 2001), our approach is to consider a worst case dose toxicity scenario. In particular, we consider a dose toxicity curve that maximizes the probability that the underlying toxicity rate at the estimated MTD is at or above a given value. This consideration yields a simplified expression which can be used to examine, graphically or in tabular form, the behavior of the standard design or of modifications to it. As a result, additional insight about the performance of the standard design can be gleaned, along with possible comparisons to other designs which are minor modifications to the standard design. These modifications could include changes in cohort size and/or the decision rule to expand or escalate/de-escalate, for example. While some of these modifications are not used in practice, we introduce them only to illustrate that it is relatively simple to experiment with different design possibilities and examine their operating characteristics under this worst case. Before proceeding, we introduce some notation. Let $P(\mathrm{DLT}_{\mathrm{MTD}})$ be the underlying toxicity rate at the estimated MTD. Let $r(v)$ be the probability that the standard design would pick an MTD for which $P(\mathrm{DLT}_{\mathrm{MTD}})$ is greater than or equal to some value $v$.
Note that here, $v$ is not meant to convey a target rate of DLTs, as such a quantity is not defined for the standard design. Rather, $v$ is any DLT rate of general interest. For example, if a clinical investigator felt that a DLT rate of 30% represented an unacceptably high DLT rate for a future cancer therapeutic in a particular cancer patient population, the investigator would be interested in the case where $v = 0.30$. We wish to find an upper bound for $r(v)$. In order for $r(v)$ to be nonzero, the dose toxicity curve must have some dose level $d$ for which $p_i \ge v$ when $i \ge d$ and $p_i < v$ when $i < d$. To maximize $r(v)$, one must maximize the probability that at least one cohort is treated at dose level $d$. One must also maximize the probability that the dose level does not de-escalate below $d$ given that at least one cohort is treated at dose level $d$. Thus, $r(v)$ is maximized when $p_i = 0$ for $i < d$ and $p_i = v$ for $i \ge d$. Note that $r(v)$ is not a complementary cumulative distribution function because the dose toxicity curve that maximizes $r(v)$ is different for each value of $v$. Using the rules of the standard design, Appendix A shows that

$$
r(v) = 1 - \frac{3v(1-v)^2\left(1-(1-v)^3\right) + \left(3v^2(1-v) + v^3\right)}{1 - (1-v)^3\left(3v^2(1-v) + v^3\right)}
$$

We can also develop analogous expressions for modifications to the standard design. Consider the following hybrid 1+2+3/3+3 design, which is a variation of the two-stage design described in Storer (2001). One patient is enrolled at the starting dose level, and if a DLT is not observed, then another patient is treated at the next highest dose level. This process continues until one patient experiences a DLT, after which 2 more patients are treated at that dose; thereafter the standard design is used for the rest of the trial. The cohort sizes would then be 2 or 3, so as to make the dose group size divisible by 3 as in the standard 3 + 3 design.
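The closed-form worst case bound is easy to evaluate numerically. The sketch below is a minimal Python illustration, not the authors' code: the function name `worst_case_unsafe_prob` and the `cohort_size` parameter are assumptions introduced here. Writing the bound in terms of the probabilities of zero, exactly one, and two or more DLTs in a cohort reproduces the expression for $r(v)$ above when the cohort size is 3, and the analogous Appendix A expressions for cohorts of size 2 and 4. It does not cover the hybrid 1+2+3/3+3 design.

```python
def worst_case_unsafe_prob(v, cohort_size=3):
    """Upper bound r(v) on the probability that a c+c design selects an MTD
    whose true DLT rate is at least v, under the worst case dose-toxicity
    curve (rate 0 below some dose level d, rate v at d and above)."""
    if not 0.0 <= v <= 1.0:
        raise ValueError("v must be a probability")
    if cohort_size < 2:
        raise ValueError("cohort_size must be at least 2")
    c = cohort_size
    p_none = (1.0 - v) ** c                 # 0 DLTs in a cohort
    p_one = c * v * (1.0 - v) ** (c - 1)    # exactly 1 DLT in a cohort
    p_two_plus = 1.0 - p_none - p_one       # 2 or more DLTs in a cohort
    # Probability that the highest visited dose is declared unsafe:
    # 2+ DLTs at once, or 1 DLT followed by at least 1 more in the next cohort.
    turn_back = p_one * (1.0 - p_none) + p_two_plus
    # Closed form of the geometric series over the number of levels climbed.
    return 1.0 - turn_back / (1.0 - p_none * p_two_plus)


r_3plus3 = worst_case_unsafe_prob(0.25)                   # 3+3 design
r_2plus2 = worst_case_unsafe_prob(0.25, cohort_size=2)    # 2+2 design
r_4plus4 = worst_case_unsafe_prob(0.25, cohort_size=4)    # 4+4 design
```

At $v = 0.25$ the 3 + 3 value is about 0.57, and the three values are ordered 4+4 < 3+3 < 2+2, so under this measure larger cohorts give a smaller worst case probability of selecting an MTD with a DLT rate of at least 25%.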
One can also consider other modifications to the standard design where the number of DLTs that dictate the next dose level to investigate are the same as that of the 3 + 3 but cohorts of size 2 or 4 are enrolled instead of cohorts of size 3. Of course, such designs will on average select different MTDs with different underlying DLT rates and will likewise have differing worst case scenarios. In Figure 1, we compare the behavior of these worst case scenarios, which are the maximum probability of choosing an MTD with an underlying toxicity rate at or above $v$ as a function of $v$. Of the 4 designs described above, the 4+4 design appears to be the safest by this measure and 2+2 the least safe. Interestingly, the curve for the 1+2+3/3+3 design falls between the curve for the 2+2 design and the 3+3 design. It is also interesting to note that while the 2+2 is virtually indistinguishable from the 1+2+3/3+3 through probabilities up to 0.2, they become quite different at 0.6. In addition to comparing the general behavior of different designs, these calculations may help guide the choice of design for a specific phase I trial. For example, consider a trial involving a new cancer therapeutic that has shown promise in animal studies but which has the potential to cause severe DLTs based upon the mechanism of action of the drug. In this scenario, investigators would likely proceed cautiously with respect to how they treat the first patients with this drug and the manner in which they escalate dosing. In this case, oncologists may be particularly interested in an upper bound on the probability of selecting an unsafe MTD. If 0.25 is considered to be an unacceptably high rate of toxicity in this situation, we can see that the 3 + 3 design has, at most, a 57% chance of selecting an unacceptably high MTD whereas the 1+2+3/3+3 has, at most, a 74% chance.
This observation, along with the designs’ behavior in other scenarios, would presumably help inform the study design choice. Alternatively, if we consider a cancer therapeutic with future plans to treat patients with late stage cancer where a more moderate rate of toxicity is acceptable, 0.35 may be considered to be the threshold for what is considered unacceptably high. If we consider the alternative concern of selecting a dose that is too low as the MTD, Figure 1 shows that the 4+4 design always has at least a 30% chance of choosing an MTD with a DLT rate below 0.15, even though higher doses are acceptable in these circumstances. This result may weigh against using such a design due to its relatively high risk of choosing an MTD that is too low, which could negatively impact efficacy.

6 Discussion

We have shown that the standard design implicitly performs isotonic regression to estimate the MTD. In addition, the rules of the standard design can be described using a trial monitoring table of the form described by Ji et al. (2007). Our results provide a more general way of viewing the standard design, which we hope will encourage further development and examination of oncology phase Ia designs while accounting for the realities of current oncologic practice. For example, one could change or add entries to the trial monitoring table (Table 2) or change the stopping rule to allow for more cohorts per dose level. One could also change the target rate for the isotonic regression or propose a completely new method for assigning treatment or determining an MTD. Drug developers and researchers are certainly ready and willing to consider new trial designs provided they truly improve existing methods.
We have also illustrated the relationship between the 3 + 3 design and the Bayesian design introduced by Ji, Li, and Bekele, the relationship being that the 3 + 3 can be seen as a special case of their Bayesian method with a modification to their stopping rule. In addition, we were able to analytically identify the upper bound for the probability of choosing an unsafe MTD with the 3 + 3 and similar designs. Similar analyses could straightforwardly be performed on other similar designs, for example Storer’s best-of-five design (Storer, 2001). Examinations of these probability limits on more advanced designs, such as the Ji, Li, Bekele method, may provide useful insights, although it is not immediately obvious how to do so. The authors would also like to note that there are certainly more perspectives on how to assess the operating characteristics of a phase I trial design, as has been done in the literature. A common one is the expected number of patients treated at doses with high toxicity. One could apply the framework presented here, considering the worst case scenario to find an upper bound on the expected number of patients treated at doses for which the probability of toxicity was at least $v$.

7 Acknowledgments

We are grateful to Mei Polley for referring us to the work of Ji, Li, and Bekele. We also thank Wei Yu, Ron Yu, David Hiller, and Grazyna Lieberman for their ideas and suggestions.

References

Arbuck, S. G. (1996). Workshop on Phase I study design: Ninth NCI/EORTC New Drug Development Symposium, Amsterdam, March 12, 1996. Ann. Oncol. 7:567-573.
Babb, J., Rogatko, A., Zacks, S. (1998). Cancer phase I clinical trials: Efficient dose escalation with overdose control. Stat. Med. 17:1103-1120.
Durham, S. D., Flournoy, N. (1994). Random walks for quantile estimation. In: Gupta, S. S., Berger, J. O., eds. Statistical Decision Theory and Related Topics V.
New Y ork: Springer, pp. 467-476. Iv anov a, A., Flourno y , N., Chung, Y. (2007). Cumulativ e cohort design for dose-finding. J. Stat. Plan. Infer. 137:2316-2327. Iv anov a, A., Montazer-Haghighi, A., Mohan t y , S. G., Durham, S. D. (2003). Impro ved up-and- do wn designs for phase I trials. Stat. Me d. 22:69-82. Ji, Y., Li, Y., Bek ele, N. (2007). Dose-finding in phase I clinical trials based on toxicit y proba- bilit y interv als. Clin. T rials 4:235-244. Ko yfman, S. A., Agraw al, M., Garrett-May er, E., Krohmal, B., W olf, E., Emanuel, E. J., Gross, C. P . (2007). Risks and benefits asso ciated with nov el phase 1 oncology trial designs. Canc er 110:1115-1124. Leung, D. H. Y., W ang, Y. G. (2001). Isotonic Designs for Phase I T rials. Contr ol le d Clin. T rials 22:126-138. Lin, Y., Shih, W. J. (2001). Statistical properties of the traditional algorithm-based designs for phase I cancer clinical trials. Biostatistics 2:203-215. Om ura, G. A. (2003). Mo dified Fib onacci Searc h. J. Clin. Onc ol. 21:3177. 13 O’Quigley , J., P ep e, M., Fisher, L. (1990). Contin ual reassessment metho d: a practical design for phase I clinical trials in cancer. Biometrics 46:33-48. P otter, D. M. (2006). Phase I Studies of Chemotherap eutic Agents in Cancer Patien ts: A Review of the Designs. J. Biopharm. Stat. 16:579-604. Reiner, E., P aoletti, X., O’Quigley , J. (1999). Op erating characteristics of the standard phase I clinical trial design. Comput. Stat. Data Anal. 30:303-315. Rob ertson, T., W righ t, F. T., Dykstra, R. L. (1988). Or der R estricte d Statistic al Infer enc e . Chic hester: Wiley . Rosen b erger, W. F., Haines, L. M. (2002). Comp eting designs for phase I clinical trials: a review. Stat. Me d. 21:2757-2770. Smith, T. L., Lee, J. J., Kantarjian, H. M., Legha, S. S., Rab er, M. N. (1996). Design and results of phase I cancer clinical trials: three-y ear exp erience at MD Anderson Cancer Center. J. Clin. Onc ol. 14:287-295. Simon, R. (1997). 
Accelerated titration designs for phase I clinical trials in oncology . J. Natl. Canc er. Inst. 89:1138-1147. Storer, B. (1989). Design and analysis of Phase I clinical trials. Biometrics 45:925-937. Storer, B. (2001). An ev aluation of phase I clinical trial designs in the contin uous dose-resp onse setting. Stat. Me d. 20:2399-2408. St ylianou, M., Flourno y , N. (2002). Dose finding using the biased coin up-and-do wn design and isotonic regression. Biometrics 58:171-177. Sun, J. (1998). Interv al censoring. Encyclop e dia of Biostatistics . New Y ork: John Wiley & Sons Ltd., pp. 2090-2095. 14 A W orst case scenario calculations Consider the 3+3 design. Let H b e the index of the highest dose level visited during the trial, and let K = H − d . Note that H is un b ounded from ab o v e, so K is as w ell. No w, r 3+3 ( v ) = 1 − ∞ X k =0 P (0 DL Ts for 3 patients) k P ( K = k | K ≥ k ) P (2 or more DL Ts for 3 patien ts) k = 1 − ∞ X k =0 ((1 − v ) 3 ) k 3 v (1 − v ) 2 (1 − (1 − v ) 3 ) + (3 v 2 (1 − v ) + v 3 ) (3 v 2 (1 − v ) + v 3 ) k = 1 − 3 v (1 − v ) 2 (1 − (1 − v ) 3 ) + (3 v 2 (1 − v ) + v 3 ) 1 − (1 − v ) 3 (3 v 2 (1 − v ) + v 3 ) Lik ewise, for the 2+2 design, r 2+2 ( v ) = 1 − ∞ X k =0 P (0 DL Ts for 2 patients) k P ( K = k | K ≥ k ) P (2 DL Ts for 2 patients) k = 1 − ∞ X k =0 ((1 − v ) 2 ) k 2 v (1 − v )(1 − (1 − v ) 2 ) + v 2 ( v 2 ) k = 1 − 2 v (1 − v )(1 − (1 − v ) 2 ) + v 2 1 − (1 − v ) 2 v 2 and for the 4+4 design, r 4+4 ( v ) = 1 − ∞ X k =0 P (0 DL Ts for 4 patients) k P ( K = k | K ≥ k ) P (2 or more DL Ts for 4 patien ts) k = 1 − ∞ X k =0 ((1 − v ) 4 ) k 4 v (1 − v ) 3 (1 − (1 − v ) 4 ) + (1 − (1 − v ) 4 − 4 v (1 − v ) 3 ) (1 − (1 − v ) 4 − 4 v (1 − v ) 3 ) k = 1 − 4 v (1 − v ) 3 (1 − (1 − v ) 4 ) + (1 − (1 − v ) 4 − 4 v (1 − v ) 3 ) 1 − (1 − v ) 4 (1 − (1 − v ) 4 − 4 v (1 − v ) 3 ) 15 F or the 1+2+3/3+3 design, r 1+2+3 / 3+3 ( v ) = 1 − ∞ X k =0 P (0 DL Ts for 1 patient) k P ( K = k | K ≥ k ) P (2 or more DL Ts for 5 patien ts) k = 1 − ∞ X k =0 
(1 − v ) k v (1 − (1 − v ) 5 ) (1 − (1 − v ) 5 − 5 v (1 − v ) 4 ) k = 1 − v (1 − (1 − v ) 5 ) 1 − (1 − v )(1 − (1 − v ) 5 − 5 v (1 − v ) 4 ) 16 T able 1: Rules for the standard design. The design stops enrolling patients after a dose is found where at most 1 of 6 patien ts exp eriences a DL T and where at least 2 DL Ts are observed at the next highest dose. DL Ts P atients Action 0 3 Escalate 1 3 No change ≥ 2 3 De-escalate ≤ 1 6 Escalate ≥ 2 6 De-escalate 17 T able 2: Alternate depiction of the standard design rules. The columns corresp ond to the n um b er of patien ts treated at the curren t dose level, and the rows corresp ond to the n umber of DL Ts at the current dose lev el. The elements sp ecify the action, where ‘E’ means escalate, ‘S’ means stay the same, and ‘DU’ means declare the curren t dose to b e unacceptably toxic and de-escalate. The design stops enrolling patien ts after a dose is found where at most 1 of 6 patients exp eriences a DL T and where at least 2 DL Ts are observed at the next highest dose. 3 6 0 E E 1 S E 2 DU DU 3 DU DU 4 DU 18 Figure 1: max P(c ho ose MTD at a dose where P (DL T MTD ) ≥ v ) vs. v . The designs are 3+3 (solid), 2+2 (dashed), 4+4 (dotted), and 1+2+3/3+3 (dotdashed). 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 v r(v) 19
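The closed forms in Appendix A can be evaluated numerically to reproduce the kind of curves summarized in Figure 1. The following Python sketch is one possible way to do so, not the authors' code. The function names `r_cohort`, `r_cohort_series`, and `r_1_2_3` are hypothetical. The sketch assumes that the 2+2, 3+3, and 4+4 formulas share a single pattern in the cohort size n, which matches the three expressions term by term. A truncated version of the infinite sum is included only as a check on the closed form.

```python
def r_cohort(v, n):
    """Closed-form worst-case bound r(v) for an n+n design (n = 2, 3, 4)."""
    p0 = (1 - v) ** n                 # P(0 DLTs among n patients)
    p1 = n * v * (1 - v) ** (n - 1)   # P(exactly 1 DLT among n patients)
    p2 = 1 - p0 - p1                  # P(2 or more DLTs among n patients)
    stop = p1 * (1 - p0) + p2         # P(K = k | K >= k)
    return 1 - stop / (1 - p0 * p2)   # geometric series summed in closed form


def r_cohort_series(v, n, terms=200):
    """Same quantity from the infinite sum, truncated after `terms` terms."""
    p0 = (1 - v) ** n
    p1 = n * v * (1 - v) ** (n - 1)
    p2 = 1 - p0 - p1
    stop = p1 * (1 - p0) + p2
    return 1 - sum((p0 ** k) * stop * (p2 ** k) for k in range(terms))


def r_1_2_3(v):
    """Closed-form worst-case bound r(v) for the 1+2+3/3+3 design."""
    none_of_5 = (1 - v) ** 5                             # P(0 DLTs among 5)
    two_plus_of_5 = 1 - none_of_5 - 5 * v * (1 - v) ** 4  # P(2 or more among 5)
    return 1 - v * (1 - none_of_5) / (1 - (1 - v) * two_plus_of_5)


if __name__ == "__main__":
    for v in (0.15, 0.25, 0.35):
        row = [r_cohort(v, n) for n in (3, 2, 4)] + [r_1_2_3(v)]
        print(v, [round(x, 3) for x in row])
```

Evaluating `1 - r_cohort(0.15, 4)` gives a value slightly above 0.30, in line with the statement that the 4+4 design always has at least a 30% chance of choosing an MTD with a DLT rate below 0.15.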