Tutorial 5: Clinical trial
This tutorial shows how to do Bayesian model selection in Infer.NET to determine if a new medical treatment is effective. We will construct two models, corresponding to an effective or ineffective treatment, and use model selection to determine the posterior probability of each, given some fictional clinical trial data.
You can run the code in this tutorial either using the Examples Browser or by opening the Tutorials solution in Visual Studio and executing ClinicalTrial.cs.
A healthy challenge
The data in this tutorial consists of the outcomes for individuals who took part in a fictional clinical trial. Each individual was either given the new treatment or given a placebo (individuals given a placebo are in the control group). A good outcome is indicated by true and a bad one by false. Here is the data:
Notice that we have also set up a couple of ranges i and j, which range over the people in the control group and in the treated group respectively. We'll use these later.
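As a rough sketch of this setup, the observed outcomes and their ranges might be declared as below. The outcome values are made up for illustration and are not the trial data, the class name is hypothetical, and the Microsoft.ML.Probabilistic namespace assumes a recent Infer.NET package (older releases use MicrosoftResearch.Infer instead).

```csharp
using System;
using Microsoft.ML.Probabilistic.Models;

static class TrialDataSketch
{
    static void Main()
    {
        // Illustrative outcomes only (true = good, false = bad); not the tutorial's data.
        bool[] controlOutcomes = { false, false, true, false, false };
        bool[] treatedOutcomes = { true, false, true, true, true };

        // Wrap the outcomes as observed random variable arrays.
        VariableArray<bool> controlGroup = Variable.Observed(controlOutcomes);
        VariableArray<bool> treatedGroup = Variable.Observed(treatedOutcomes);

        // Each observed array carries a range over its elements.
        Range i = controlGroup.Range; // people in the control group
        Range j = treatedGroup.Range; // people in the treated group

        Console.WriteLine($"{controlOutcomes.Length} control, {treatedOutcomes.Length} treated");
    }
}
```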
To determine whether the treatment is effective, we will build two models of this data: one which assumes the treatment has an effect and one which assumes it doesn't. To perform Bayesian model selection, we need to introduce a boolean random variable which switches between the two models. In this analysis, we will give this variable a uniform prior. In a real clinical trial, choosing this prior would require some thought: how likely, a priori, is a new treatment to be effective?
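A minimal sketch of such a switching variable follows. The helper and its priorProbEffective parameter are hypothetical names. The default of 0.5 corresponds to the uniform prior used in this analysis, and a real trial might justify a different value.

```csharp
using Microsoft.ML.Probabilistic.Models;

static class ModelSwitchSketch
{
    // Hypothetical helper: returns the boolean variable that selects between the two models.
    // priorProbEffective = 0.5 is the uniform prior; it is the prior probability of "effective".
    public static Variable<bool> Create(double priorProbEffective = 0.5)
    {
        return Variable.Bernoulli(priorProbEffective).Named("isEffective");
    }
}
```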
Cause and effect
First, let us consider the case where the treatment has an effect on the outcome. In this case, the probability of a good outcome will be different for people in the control group and in the treated group. Because we don't know these two probabilities, we define random variables for them with Beta priors and learn them during inference. The code for this model is shown in the snippet below. To achieve model selection, we put this modelling code in an if block, so that the model only applies if isEffective is true.
See also: Branching on variables to create mixture models; Computing model evidence for model selection.
The variables probIfTreated and probIfControl are declared outside of the if block but defined inside it. This means the variables can still be referred to after the using statement that creates the if block, which will allow us to infer their values later.
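The effective-treatment branch might be sketched as below. This is not a verbatim copy of ClinicalTrial.cs: the wrapper class and method are hypothetical, and the Beta(1, 1) parameters are an assumption, since the text only says that Beta priors are used.

```csharp
using Microsoft.ML.Probabilistic.Models;

static class EffectiveModelSketch
{
    // Hypothetical wrapper: defines the "treatment has an effect" model inside an if block.
    public static (Variable<double> probIfControl, Variable<double> probIfTreated) Define(
        Variable<bool> isEffective,
        VariableArray<bool> controlGroup,
        VariableArray<bool> treatedGroup)
    {
        Range i = controlGroup.Range;
        Range j = treatedGroup.Range;

        // Declared outside the if block so they remain accessible afterwards.
        Variable<double> probIfControl, probIfTreated;

        using (Variable.If(isEffective))
        {
            // Separate success probabilities for the two groups.
            // Beta(1, 1) is an assumed (uniform) prior.
            probIfControl = Variable.Beta(1, 1);
            controlGroup[i] = Variable.Bernoulli(probIfControl).ForEach(i);
            probIfTreated = Variable.Beta(1, 1);
            treatedGroup[j] = Variable.Bernoulli(probIfTreated).ForEach(j);
        }

        return (probIfControl, probIfTreated);
    }
}
```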
Notice that we have not specified whether the treatment has a good effect or not, only that it has some effect. We will be able to see if it is a good effect by comparing the posterior distributions over probIfTreated and probIfControl.
A bit of background
Now let us consider the alternative model, where the treatment has no effect, i.e. the background model. In this case, the probability of a good outcome will be the same for people in both groups. Again, the value of this probability is unknown, so we will put a Beta prior on it. This time we use Variable.IfNot to create the surrounding if block, so that the model will apply in the case where isEffective is false. You can think of this as being the else clause for the previous if block.
The variable probAll is both declared and defined inside the if block, since we will not be using it later on.
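A corresponding sketch of the background branch is given below, under the same assumptions: a hypothetical wrapper class and an assumed Beta(1, 1) prior.

```csharp
using Microsoft.ML.Probabilistic.Models;

static class BackgroundModelSketch
{
    // Hypothetical wrapper: defines the "treatment has no effect" model,
    // which applies only when isEffective is false.
    public static void Define(
        Variable<bool> isEffective,
        VariableArray<bool> controlGroup,
        VariableArray<bool> treatedGroup)
    {
        Range i = controlGroup.Range;
        Range j = treatedGroup.Range;

        using (Variable.IfNot(isEffective))
        {
            // One shared success probability for both groups.
            // Beta(1, 1) is an assumed (uniform) prior.
            Variable<double> probAll = Variable.Beta(1, 1);
            controlGroup[i] = Variable.Bernoulli(probAll).ForEach(i);
            treatedGroup[j] = Variable.Bernoulli(probAll).ForEach(j);
        }
    }
}
```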
We have now fully defined the model and can go ahead and infer the distributions of interest.
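The inference step might look like the following sketch. The Report method is a hypothetical helper that takes the model variables as parameters, and the message wording is illustrative rather than the tutorial's exact output.

```csharp
using System;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Models;

static class TrialInferenceSketch
{
    // Hypothetical helper: runs inference and prints the quantities of interest.
    public static void Report(
        Variable<bool> isEffective,
        Variable<double> probIfTreated,
        Variable<double> probIfControl)
    {
        var engine = new InferenceEngine();

        // Posterior over the model switch: the probability that the treatment has an effect.
        Bernoulli effective = engine.Infer<Bernoulli>(isEffective);

        // Posteriors over the per-group probabilities of a good outcome
        // (these variables are defined inside the isEffective branch).
        Beta treated = engine.Infer<Beta>(probIfTreated);
        Beta control = engine.Infer<Beta>(probIfControl);

        Console.WriteLine("Probability treatment has an effect = " + effective.GetProbTrue());
        Console.WriteLine("Mean probability of good outcome if treated = " + treated.GetMean());
        Console.WriteLine("Mean probability of good outcome if control = " + control.GetMean());
    }
}
```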
When we run this code, it prints out:
Hence, there is some evidence from this data that the treatment has an effect and, furthermore, the effect is a positive one.
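For a model this small, the posterior over the two models can also be checked by hand. The sketch below uses plain C# without Infer.NET. It assumes Beta(1, 1) priors, a uniform prior over the two models, and made-up outcome counts rather than the trial data. Under a Beta(1, 1) prior, a sequence with g good and b bad outcomes has marginal likelihood g! b! / (g + b + 1)!.

```csharp
using System;

static class EvidenceCheckSketch
{
    // Marginal likelihood of a sequence with `good` true and `bad` false outcomes
    // under a Beta(1, 1) prior: good! * bad! / (good + bad + 1)!
    static double Evidence(int good, int bad)
    {
        double result = 1.0;
        for (int k = 1; k <= good; k++) result *= k;
        for (int k = 1; k <= bad; k++) result *= k;
        for (int k = 1; k <= good + bad + 1; k++) result /= k;
        return result;
    }

    static void Main()
    {
        // Illustrative counts only; not the tutorial's data.
        int controlGood = 1, controlBad = 4;
        int treatedGood = 4, treatedBad = 1;

        // Effective model: each group has its own probability, so the evidences multiply.
        double effective = Evidence(controlGood, controlBad) * Evidence(treatedGood, treatedBad);

        // Background model: both groups share one probability, so the counts are pooled.
        double ineffective = Evidence(controlGood + treatedGood, controlBad + treatedBad);

        // With a uniform prior over the two models, the posterior is the normalised evidence.
        double posterior = effective / (effective + ineffective);
        Console.WriteLine("P(effective | data) = " + posterior);
    }
}
```

For these made-up counts the posterior probability of the effective model is about 0.755. Infer.NET's answer for the same inputs may differ slightly depending on the inference algorithm used.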
If you find these tutorials to be effective, you can move on to the next.