Leapfrog Exercises

Exercises and scripts to help introduce the ‘leapfrog’ trial design

These are adapted from the workshop “Innovative approaches to clinical trial design and analysis”, given at Ruhr-Universität Bochum on 19.11.2022. This document is published at https://www.psych.uni-goettingen.de/en/trace/team/blackwell-simon under a CC BY 4.0 license. Please contact Simon Blackwell () if you have any questions, and visit the same page for further information.


Packages needed

You will need to install and load the following packages for these exercises:

install.packages("BayesFactor")
install.packages("MASS")
install.packages("devtools") # if not yet installed
library(devtools)
install_github("nicebread/BFDA", subdir="package")

library(BFDA)
library(BayesFactor)
library(MASS)

Exercise 2: Simulating data using BFDA and selecting trial parameters

The Bayes factor design analysis (BFDA) package provides a simple way to simulate and analyse sequential Bayesian analyses for trial planning. For more complex analyses you can create your own custom simulations (e.g. see the scripts at https://osf.io/8mxda/ or https://osf.io/b397u/ for examples).

Please see https://rawgit.com/nicebread/BFDA/master/package/doc/BFDA_manual.html for the BFDA manual and full instructions

1. Simulate sequential analyses for planning

A first step is to simulate the results of sequential analyses under i) the alternative hypothesis for the target effect size of interest (to find parameters that give you sufficient power), and ii) the null hypothesis (to find parameters that give a suitable false-positive (Type 1) error rate). Once you have suitable parameters for these, you can then simulate a range of other effect sizes to get a better picture of your power at different effect sizes.

As the simulations can take some time, for demonstration purposes in this exercise we will i) target a medium effect size (d = 0.5), as this requires smaller maximum sample sizes, and ii) run only a smallish number of simulations (1000), as we are not too worried about accuracy/precision here. For planning an actual trial you would want a larger number (e.g. 5000).

Please run the following simulations as preparation:

set.seed(19112022)

#for the following I use the naming convention sim.Hd followed by the simulated effect size, e.g. sim.Hd0.5 is for simulating the hypothesis of a between-group effect size of d = 0.5

sim.Hd0.5 <- BFDA.sim(expected.ES=0.5, type="t.between", prior=list("Cauchy", list(prior.location=0, prior.scale=sqrt(2)/2)), n.min=10, n.max=80, alternative="greater", boundary=Inf, B=1000, verbose=TRUE, stepsize = 1)

sim.Hd0.0 <- BFDA.sim(expected.ES=0, type="t.between", prior=list("Cauchy", list(prior.location=0, prior.scale=sqrt(2)/2)), n.min=10, n.max=80, alternative="greater", boundary=Inf, B=1000, verbose=TRUE, stepsize = 1)

#If you like (for later) you can now also run some in-between or larger values, e.g.

sim.Hd0.2 <- BFDA.sim(expected.ES=0.2, type="t.between", prior=list("Cauchy", list(prior.location=0, prior.scale=sqrt(2)/2)), n.min=10, n.max=80, alternative="greater", boundary=Inf, B=1000, verbose=TRUE, stepsize = 1)

sim.Hd0.4 <- BFDA.sim(expected.ES=0.4, type="t.between", prior=list("Cauchy", list(prior.location=0, prior.scale=sqrt(2)/2)), n.min=10, n.max=80, alternative="greater", boundary=Inf, B=1000, verbose=TRUE, stepsize = 1)

sim.Hd0.6 <- BFDA.sim(expected.ES=0.6, type="t.between", prior=list("Cauchy", list(prior.location=0, prior.scale=sqrt(2)/2)), n.min=10, n.max=80, alternative="greater", boundary=Inf, B=1000, verbose=TRUE, stepsize = 1)

#etc...

To explain, these simulate sequential Bayes-factor-based analyses with the following settings:

  1. an effect size of d = 0.5 (expected.ES=0.5) for sim.Hd0.5, or d = 0 for sim.Hd0.0

  2. a between-group t-test, e.g. the difference between two groups in change in symptoms (type="t.between")

  3. a default Cauchy prior with an rscale value of 0.707 (prior=list("Cauchy", list(prior.location=0, prior.scale=sqrt(2)/2))) [we won't go into priors here, but default ones like this work fine! See Stefan et al. (2019) for discussion of using informed priors with BFDA, https://link.springer.com/article/10.3758/s13428-018-01189-8]

  4. a minimum sample size (per arm) of 10 (n.min = 10) - note that when trying to find parameters later you can choose higher minimum sample sizes, but not lower

  5. a maximum sample size (per arm) of 80 (n.max = 80) - note that when trying out parameters later on you can try lower, but not higher maximum sample sizes

  6. a directional (i.e. one-tailed) Bayes factor (alternative = "greater"), as we are interested in whether one treatment is superior to another, not just whether they differ. Setting boundary = Inf means each simulated trajectory runs all the way to n.max without stopping, so that different stopping boundaries can be applied afterwards

  7. we use 1000 simulations (B = 1000) - fine for getting a feel, but for finalising your study design you would want a larger number, e.g. 5000

  8. we carry out the analysis after every participant (stepsize = 1). If you are looking at larger samples you could instead repeat the analysis after every 5 or 10 participants.
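
To make point 6 concrete, here is a toy example (hypothetical data, not part of the exercise) showing the kind of directional Bayesian t-test being simulated:

set.seed(1)
treatment <- rnorm(40, mean = 0.5) # hypothetical treatment arm, true d = 0.5
control <- rnorm(40, mean = 0)     # hypothetical control arm
# Directional test: is the standardised effect in the interval (0, Inf)?
BF <- BayesFactor::ttestBF(x = treatment, y = control,
                           nullinterval = c(0, Inf), rscale = sqrt(2)/2)
exp(BF@bayesFactor$bf[1]) # first element = BF for the directional hypothesis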

If you were carrying out large important simulations that took a long time you would probably want to save them so that you can load them again later without having to re-run the simulations:

#Saving a BFDA simulation object:
saveRDS(sim.Hd0.5,"sim.Hd0.5.dd.mm.yy.RDS")
saveRDS(sim.Hd0.0,"sim.Hd0.0.dd.mm.yy.RDS")
#to load again
sim.Hd0.5<-readRDS("sim.Hd0.5.dd.mm.yy.RDS")
sim.Hd0.0<-readRDS("sim.Hd0.0.dd.mm.yy.RDS")

2. Try out different analysis parameters to find a set that fits your requirements

Overview

Now that you have a simulated set of sequential BFs you can test out what would happen if you applied different sets of analysis parameters to them.

These are:

  • Nmin: The minimum sample size (per arm) at which you start the sequential analyses

  • Nmax: The maximum sample size (per arm) at which you drop an arm (if it has not already hit a BF boundary)

  • BFfail: A BF threshold for failure (i.e. sufficient evidence for the null hypothesis of non-superiority to the control condition vs. the alternative hypothesis of superiority). This will be a value less than 1 (e.g. 1/3, 1/5, 1/10, etc.)

  • BFsuccess: A BF threshold for success (i.e. sufficient evidence for the alternative hypothesis of superiority over the control condition vs. the null hypothesis of non-superiority to the control condition). This will be a value greater than 1 (e.g. 3, 5, 10)

We are interested in:

  1. power: what proportion of arms hit the BFsuccess threshold when d > 0

  2. false-positive / Type 1 error rate: what proportion of arms hit the BFsuccess threshold when d = 0 (or d < 0)

  3. keeping the average sample sizes as low as possible

There are several functions in BFDA to analyse simulation outcomes, but here we will just use the plot function. We will start with the limits used in the simulations (i.e. n.min = 10 and n.max = 80) and a default set of starting boundaries of BFfail = 1/5 and BFsuccess = 5 (boundary=c(1/5, 5)). The parameter n.trajectories simply sets how many individual BF trajectories are drawn on the graph.

#First plot for H1
dev.new() #sometimes the plot doesn't work if you don't make a new window first
plot(sim.Hd0.5, n.min=10, n.max=80, boundary=c(1/5, 5), n.trajectories = 60)

#Then plot for H0
dev.new() #sometimes the plot doesn't work if you don't make a new window first
plot(sim.Hd0.0, n.min=10, n.max=80, boundary=c(1/5, 5), n.trajectories = 60)

#There are other functions in BFDA to explore the simulations/sequential analyses but I find the plots the most informative. However, feel free to explore.
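
For example, BFDA.analyze() prints a text summary of stopping behaviour for a given design. A minimal sketch based on the BFDA manual; this assumes your BFDA version accepts the same boundary specification as plot() (if it only takes a single symmetric value, use e.g. boundary=5):

#Summarise stopping behaviour under d = 0.5 and d = 0 for the chosen design
BFDA.analyze(sim.Hd0.5, design="sequential", n.min=10, n.max=80, boundary=c(1/5, 5))
BFDA.analyze(sim.Hd0.0, design="sequential", n.min=10, n.max=80, boundary=c(1/5, 5))
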
Exercise steps
  1. Play around with the parameters (n.min, n.max, the BF boundaries) and see what happens: e.g. increasing n.min will tend to reduce error rates, but means you lose the chance to make decisions as early. Remember that the BF boundaries don't have to be symmetrical (e.g. you could use boundary=c(1/3, 10) if you weren't so worried about false negatives but were very concerned about potential false positives); a variation along these lines is sketched after this list.

  2. Try to come up with a set of parameters that gives you 80% power and a false-positive (Type 1) error rate of < 5%: i.e. for the d = 0.5 simulation, ≥ 80% of arms stopping at the H1 boundary, and for the d = 0 simulation, < 5% (i.e. at most 4%) stopping at the H1 boundary.

  3. Once you find this you can see if you can improve on these, to reduce the potential sample sizes needed.

  4. See what proportion of arms hit BFsuccess (the H1 boundary) for sim.Hd0.5 at different sample sizes (by adjusting n.max): the n.max at which 50% of arms stop at the n.max boundary (rather than at a BF boundary) tells you the typical (median) sample size you might expect if d = 0.5 with these parameters. You can then do the same for sim.Hd0.0. It might be that you can find a set of parameters that gives you smaller expected sample sizes.

  5. If you have simulated other effect sizes you can see what happens for these, e.g. what happens if you power for d = 0.5 but the true effect is d = 0.4, d = 0.3, etc.?
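
As a concrete starting point for step 1, the variation below raises n.min and uses asymmetric boundaries on the same simulations (the values are illustrative, not a recommendation):

#Stricter evidence required for success (BF = 10) than for failure (BF = 1/3)
dev.new()
plot(sim.Hd0.5, n.min=20, n.max=80, boundary=c(1/3, 10), n.trajectories = 60)
dev.new()
plot(sim.Hd0.0, n.min=20, n.max=80, boundary=c(1/3, 10), n.trajectories = 60)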

Summarising the simulations for planning

One way to collect this information is in a table, like the examples in Table 2 and Table 3 of Blackwell et al. (2019): https://journals.sagepub.com/doi/full/10.1177/2167702619858071

Table 2 from that paper is reproduced below. It illustrates a particular set of parameters chosen for a small-to-medium between-group effect size equivalent to Cohen’s d = 0.4: Nmin = 35 (per arm), Nmax = 125 (per arm), BFfail = 1/4, and BFsuccess = 5 (with a directional default Cauchy prior, rscale parameter = √2/2).

Here we can see that we have a false-positive rate of < 5% (top row) and 81% power to detect d = 0.4. We can also see that 54% of the time we would hit BFfail at Nmin when d = 0, and 54% of the time we would stop the trial by n = 50 per arm when d = 0.4 (8% at BFfail, 46% at BFsuccess):


“True” effect size   Probability of reaching threshold at each participant number (per group)
(Cohen’s d)          Discontinuation threshold                 Replacement threshold
                     n=35   n=50   n=75   n=100  n=125         n=35   n=50   n=75   n=100  n=125
0 (null)             54     70     81     86     89            1      3      3      4      4
0.1                  37     52     62     68     71            4      7      10     12     13
0.2                  22     33     41     45     47            8      15     23     28     32
0.3                  11     18     22     24     25            16     29     43     52     58
0.4                  5      8      10     10     11            28     46     65     75     81
0.5                  2      3      4      4      4             43     64     82     91     94
0.6                  1      1      1      1      1             60     80     94     97     98
0.7                  0      0      0      0      0             75     91     98     100    100
0.8                  0      0      0      0      0             88     97     100    100    100



Producing these plots is quite a long-winded way to arrive at such a table, but it at least gives you a good feel for what happens when you change the parameters. If you were doing your own simulations you could write a script to output such a table automatically (e.g. see the scripts at https://osf.io/8mxda/ and https://osf.io/b397u/ for examples).
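
As a minimal sketch of such a script (not taken from the linked examples): it assumes the object returned by BFDA.sim stores its raw trajectories in $sim as a data frame with columns id, n and logBF - check str(sim.Hd0.5) against your BFDA version before relying on this:

#Proportion (%) of simulated arms that have hit the fail/success boundary
#by each per-arm sample size, computed from the raw trajectories
stop.props <- function(sim, n.min, n.max, bf.fail, bf.success, ns) {
  traj <- sim$sim # assumed layout: one row per (simulation id, n), with logBF
  traj <- traj[traj$n >= n.min & traj$n <= n.max, ]
  first.hit <- function(one) {
    one <- one[order(one$n), ]
    hit <- which(one$logBF >= log(bf.success) | one$logBF <= log(bf.fail))
    if (length(hit) == 0) return(c(n = NA, success = NA))
    c(n = one$n[hit[1]], success = as.numeric(one$logBF[hit[1]] >= log(bf.success)))
  }
  hits <- do.call(rbind, lapply(split(traj, traj$id), first.hit))
  stopped <- !is.na(hits[, "n"])
  sapply(ns, function(N) c(
    fail    = 100 * mean(stopped & hits[, "n"] <= N & hits[, "success"] == 0),
    success = 100 * mean(stopped & hits[, "n"] <= N & hits[, "success"] == 1)))
}

#e.g. one row of a table like the one above:
#stop.props(sim.Hd0.5, n.min=10, n.max=80, bf.fail=1/5, bf.success=5, ns=c(20, 40, 60, 80))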

Exercise 3: Carry out sequential analysis of (simulated) trial data

Overview and setup

In this exercise you will analyse some (simulated) data using the trial parameters you selected, and find out when you hit the BF boundaries.

First you will need to simulate the trial data - please run the code below (which I’m happy to explain but there’s no need to understand it for the exercise - it’s just a means to provide you with the data to analyse):

#Simulates 100 participants in each arm of a 3-arm trial where:
#there is a correlation of r = 0.35 between pre- and post-treatment scores
#one treatment arm is not superior to control, the other is
set.seed(19112022)
samples <- 100
r <- 0.35
#Control arm (C): draw correlated pre/post scores, then rescale to
#integer questionnaire-style scores (roughly mean 20-25, SD 5)
data <- mvrnorm(n=samples, mu=c(1,0.8), Sigma=matrix(c(1,r,r,1), nrow=2), empirical=FALSE)
X <- data[,1]
Y <- data[,2]
X <- as.integer(X*5+20)
Y <- as.integer(Y*5+20)
#Treatment arm 1 (Tx1)
data <- mvrnorm(n=samples, mu=c(1,0.15), Sigma=matrix(c(1,r,r,1), nrow=2), empirical=FALSE)
X1 <- data[,1]
Y1 <- data[,2]
X1 <- as.integer(X1*5+20)
Y1 <- as.integer(Y1*5+20)
#Treatment arm 2 (Tx2)
data <- mvrnorm(n=samples, mu=c(1,0.9), Sigma=matrix(c(1,r,r,1), nrow=2), empirical=FALSE)
X2 <- data[,1]
Y2 <- data[,2]
X2 <- as.integer(X2*5+20)
Y2 <- as.integer(Y2*5+20)
#Per-arm change scores (not used below, but handy for checking)
cdiff <- Y-X
T1diff <- Y1-X1
T2diff <- Y2-X2
#Combine the arms into one dataset and shuffle the row order to mimic
#participants entering the trial in a random sequence
predata <- c(X,X1,X2)
postdata <- c(Y,Y1,Y2)
diffdata <- postdata-predata
group <- c(rep("C",samples), rep("Tx1",samples), rep("Tx2",samples))
dseq <- seq(1,samples*3)
pid <- as.character(seq(1,samples*3))
mydata <- cbind(predata,postdata,diffdata,group)
mydata <- mydata[sample(1:nrow(mydata)), ]
mydata <- data.frame(cbind(pid,dseq,mydata))
mydata$dseq <- as.integer(mydata$dseq)
mydata$predata <- as.integer(mydata$predata)
mydata$postdata <- as.integer(mydata$postdata)
mydata$diffdata <- as.integer(mydata$diffdata)

You now have a dataframe (mydata) with pre- and post-treatment outcome data for a pretend 3-arm trial of treatments for depression (i.e. a decrease in score on the outcome measure is good). There are three arms: “C” (control condition, e.g. treatment as usual), “Tx1” (new treatment 1) and “Tx2” (new treatment 2). One treatment arm is superior to control; one isn’t. Each arm includes 100 participants.

Dataframe columns are:

  • pid = participant id (a string of a number from 1 to 300)
  • dseq = sequence in which they provided outcome data for the trial (numerical, here the same as pid for convenience)
  • predata = score on the depression outcome measure at pre-treatment
  • postdata = score on the depression outcome measure at post-treatment
  • diffdata = post-treatment score minus pre-treatment score
  • group = which group (C, Tx1, Tx2)

Exploring the data

Take a look at the data (e.g. with View(mydata)) to get a sense of it. You can then use the functions in the code chunk below to explore the potential outcome of applying sequential Bayesian analyses.
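
For instance, some quick base-R checks (note that summarising the change scores per arm like this is already “peeking” at the trial outcome; the sequential analyses below simply formalise that):

head(mydata)            #first few rows, in trial-entry order
table(mydata$group)     #n per arm (100 each)
aggregate(diffdata ~ group, data = mydata, FUN = mean) #mean change score per arm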

First, run the code below to load the two functions:

BFsnapshot<-function(BFdata,n=300,rs=(sqrt(2)/2)){

  if (length(BFdata$pid)<n){
    return("Error: n larger than number of participants")
  }

  tryCatch(
    {
      tdata<-BFdata[1:n,]
      cat("\n\n Total N = ",n,"\n\n")
      cat("Control: n = ",length(tdata[tdata$group=="C",]$pid),", mean change (post minus pre-treatment) = ", mean(tdata[tdata$group=="C",]$diffdata,na.rm=TRUE)," (SD = ", sd(tdata[tdata$group=="C",]$diffdata,na.rm=TRUE),")\n\n",sep="")

      cat("Tx1: n = ",length(tdata[tdata$group=="Tx1",]$pid),", mean change (post minus pre-treatment) = ", mean(tdata[tdata$group=="Tx1",]$diffdata,na.rm=TRUE)," (SD = ", sd(tdata[tdata$group=="Tx1",]$diffdata,na.rm=TRUE),")\n\n",sep="")
      #effect size (Cohen's d) for Tx1 vs. control
      d<-effectsize::cohens_d(tdata[tdata$group=="C",]$diffdata,tdata[tdata$group=="Tx1",]$diffdata)
      cat("Effect size vs. control: d=",d$Cohens_d," 95% CIs [",d$CI_low,",",d$CI_high,"]\n\n",sep="")
      #directional Bayesian t-test (nullinterval=c(0,Inf)); with a nullinterval,
      #ttestBF returns two BFs (the interval and its complement), so take the first
      BF<-BayesFactor::ttestBF(x=tdata[tdata$group=="C",]$diffdata,y=tdata[tdata$group=="Tx1",]$diffdata,nullinterval=c(0,Inf),rscale=rs)
      cat("BF vs. control: BF=",exp(BF@bayesFactor$bf[1]),"\n\n",sep="")

      cat("Tx2: n = ",length(tdata[tdata$group=="Tx2",]$pid),", mean change (post minus pre-treatment) = ", mean(tdata[tdata$group=="Tx2",]$diffdata,na.rm=TRUE)," (SD = ", sd(tdata[tdata$group=="Tx2",]$diffdata,na.rm=TRUE),")\n\n",sep="")
      #effect size (Cohen's d) for Tx2 vs. control
      d<-effectsize::cohens_d(tdata[tdata$group=="C",]$diffdata,tdata[tdata$group=="Tx2",]$diffdata)
      cat("Effect size vs. control: d=",d$Cohens_d," 95% CIs [",d$CI_low,",",d$CI_high,"]\n\n",sep="")
      #directional Bayesian t-test for Tx2, again taking the first of the two BFs
      BF<-BayesFactor::ttestBF(x=tdata[tdata$group=="C",]$diffdata,y=tdata[tdata$group=="Tx2",]$diffdata,nullinterval=c(0,Inf),rscale=rs)
      cat("BF vs. control: BF=",exp(BF@bayesFactor$bf[1]),"\n\n",sep="")
    },
    error=function(cond) {
      return(paste0("Something went wrong! ",cond))
    })
}

seqBFs<-function(BFdata,rs=(sqrt(2)/2)){
  tryCatch(
    {
      alldata<-BFdata[1:length(BFdata$pid),]
      dsTx1<-vector()
      dsTx2<-vector()
      BFsTx1<-vector()
      BFsTx2<-vector()
      nsTx1<-vector()
      nsTx2<-vector()
      is<-vector()
      #start at participant 10 so that each group has some data
      #(the tests can fail with very small samples)
      for (i in 10:length(alldata$pid)){
        tdata<-alldata[1:i,]
        Ns<-as.data.frame(cbind(table(tdata[1:i,]$group)))
        dsTx1<-c(dsTx1,(effectsize::cohens_d(tdata[tdata$group=="C",]$diffdata,tdata[tdata$group=="Tx1",]$diffdata))$Cohens_d)
        #directional BF; ttestBF returns two BFs (interval and complement), so take the first
        BFsTx1<-c(BFsTx1,exp((BayesFactor::ttestBF(x=tdata[tdata$group=="C",]$diffdata,y=tdata[tdata$group=="Tx1",]$diffdata,nullinterval=c(0,Inf),rscale=rs)@bayesFactor$bf)[1]))
        nsTx1<-c(nsTx1,Ns["Tx1",])

        dsTx2<-c(dsTx2,(effectsize::cohens_d(tdata[tdata$group=="C",]$diffdata,tdata[tdata$group=="Tx2",]$diffdata))$Cohens_d)
        BFsTx2<-c(BFsTx2,exp((BayesFactor::ttestBF(x=tdata[tdata$group=="C",]$diffdata,y=tdata[tdata$group=="Tx2",]$diffdata,nullinterval=c(0,Inf),rscale=rs)@bayesFactor$bf)[1]))
        nsTx2<-c(nsTx2,Ns["Tx2",])
        is<-c(is,i)
      }
      allresults<-data.frame(cbind(is,nsTx1,dsTx1,BFsTx1,nsTx2,dsTx2,BFsTx2))
      return(allresults)
    },
    error=function(cond) {
      return(paste0("Something went wrong! ",cond))
    })
}

You can now use these two functions to explore the effect of carrying out sequential Bayesian analyses (here, for simplicity, t-tests on change scores).

Function: BFsnapshot()

BFsnapshot() can be used to give you a snapshot of the data at a particular point in time. Arguments are:

  • BFdata = the dataframe with the data (i.e. here it is mydata)
  • n = the total sample size / sequence number (up to a max of 300) you want to take the snapshot at (e.g. if you put 100 in here, it will give you a snapshot of the analysis outcomes at the point there were 100 total participants in the trial)
  • rs = you can ignore this unless you want to change the rscale parameter (defaults to √2/2)

e.g. BFsnapshot(mydata,100) would give you the analysis output at the point when there were 100 people in the trial.
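
If you want snapshots at several points in one go, you could loop over sample sizes (the values here are arbitrary):

for (n in c(60, 120, 180, 240, 300)) BFsnapshot(mydata, n)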

Function: seqBFs()

seqBFs() outputs a dataframe with sequential BFs over the course of the simulated trial. You just pass it the dataframe (and can specify an rscale parameter if you want to change this):

trialresults<-seqBFs(mydata)

will give you a dataframe (trialresults), which you can then investigate (e.g. via View(trialresults), or you can try making some plots) to see how the BFs develop over time.

Columns in the dataframe are:

  • is = index / total N participants in the trial
  • nsTx1 = n in Tx1 arm
  • dsTx1 = effect size (Cohen’s d) for Tx1 vs control
  • BFsTx1 = Bayes Factor for Tx1 vs control
  • nsTx2 = n in Tx2 arm
  • dsTx2 = effect size (Cohen’s d) for Tx2 vs control
  • BFsTx2 = Bayes Factor for Tx2 vs control
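
For example, a minimal base-R plot of both BF trajectories on a log scale, with dashed lines at illustrative boundaries of 1/5 and 5 (substitute your chosen values):

plot(trialresults$is, trialresults$BFsTx1, type = "l", log = "y", col = "blue",
     xlab = "Total N in trial", ylab = "BF vs. control (log scale)",
     ylim = range(c(trialresults$BFsTx1, trialresults$BFsTx2, 1/5, 5)))
lines(trialresults$is, trialresults$BFsTx2, col = "red")
abline(h = c(1/5, 1, 5), lty = c(2, 3, 2)) #fail boundary, BF = 1, success boundary
legend("topleft", legend = c("Tx1", "Tx2"), col = c("blue", "red"), lty = 1)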

Exercise task:

See if you can work out what would have happened if you used your chosen trial parameters on this data set. When would you hit a BF boundary for each treatment arm (if at all)?
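
One way to check this programmatically, using illustrative boundaries of BFfail = 1/5 and BFsuccess = 5 (substitute your own values); note that is counts the total N across all three arms, so look up nsTx1/nsTx2 for the per-arm n at that point:

#First total N at which each arm crosses a boundary (NA if it never does)
trialresults$is[which(trialresults$BFsTx1 >= 5)[1]]   #Tx1 hits BFsuccess
trialresults$is[which(trialresults$BFsTx1 <= 1/5)[1]] #Tx1 hits BFfail
trialresults$is[which(trialresults$BFsTx2 >= 5)[1]]   #Tx2 hits BFsuccess
trialresults$is[which(trialresults$BFsTx2 <= 1/5)[1]] #Tx2 hits BFfail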

Next steps:

If you want to take this further you might want to simulate more complex analyses (e.g. mixed models) and multi-arm trials - see the examples at https://osf.io/8mxda/ and https://osf.io/b397u/ for ideas.