[MUSIC] In this lecture, we'll talk about how you can quantify the uncertainty surrounding estimates with confidence intervals. We'll talk about the relationship between confidence intervals and p-values, and how to correctly interpret a confidence interval. Now, if we do research, we very often collect data from a specific sample with the goal to generalize to an entire population. It's very often not feasible to collect information from the entire population. If you examine people, you cannot give your questionnaire to everybody who's alive in the world. Now, sometimes it might be possible to get measurements from the entire population. Let's say that your population is all human beings who ever walked on the moon. Well, there are only 12 people who've done this so far, the 12 people on this slide. So if you want to know something about these 12 people, for example their height, then you can just calculate it. You don't need any statistics. You don't need to generalize to a population, because you've measured the entire population. Now, in most research this is not feasible, so you collect data from a sample, and then you try to generalize from this sample to the population. And whenever you do this, there's always some uncertainty. Now, it's very important to always quantify and report the uncertainty surrounding point estimates, for example an effect size measure. Whenever you report an important statistic, such as an effect size, you have to report a confidence interval around this effect size estimate. We can visualize this. In this case, we have a correlation, and we've plotted the correlation between two variables. In this graph, we've visualized the uncertainty around the correlation between made-up data of the IQ of one twin and the IQ of the other twin. We can see that these IQ scores are related. There's a positive correlation of 0.52, but of course our correlation is an estimate from a sample.
If we want to generalize this to the population, then there's always some uncertainty, and in this case the uncertainty is illustrated by the blue area. The blue area illustrates a 95% confidence interval. But how should you interpret a 95% confidence interval? Confidence intervals are a statement about the percentage of future confidence intervals that contain the true parameter value. Now, this is not very intuitive. You might think that a single confidence interval contains the true population parameter 95% of the time. That would make sense, and if a non-statistician were to come up with the concept of a confidence interval, that's probably what we'd want it to be. But that's not how they work. The confidence interval is a frequentist concept, which means that it applies to the long run. In the long run, 95% of future confidence intervals will contain the true parameter value, so the true effect size, for example, in the population. Because this is counter-intuitive, let's try to visualize it. In this illustration, which we again take from rpsychologist.com, you can see many, many different confidence intervals for the same simulated study. We see blue dots, which are the sample estimates, the effect sizes from the samples that we collected. And the black lines are the 95% confidence intervals. Now, most of the black lines contain the true population mean. In this case, the true population mean is zero, and we see that most of the black lines touch upon zero. Sometimes they don't. We calculated a 95% confidence interval, so 5% of the time, in the long run, we will get a confidence interval that does not contain the true population parameter. These are indicated by the black arrows and visualized by red lines. So the red lines do not contain the true population parameter, and this will happen, in the long run, approximately 5% of the time.
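This long-run claim is easy to check with a quick simulation. The sketch below is not from the lecture; it assumes a normally distributed population with a known true mean and uses the normal-approximation interval:

```python
import random
import statistics

random.seed(1)

# Population with a known true mean; each simulated study draws n observations.
true_mean, sd, n, simulations = 0.0, 1.0, 100, 10_000

hits = 0
for _ in range(simulations):
    sample = [random.gauss(true_mean, sd) for _ in range(n)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / n ** 0.5        # SE = SD / sqrt(N)
    lower, upper = m - 1.96 * se, m + 1.96 * se     # normal-approximation 95% CI
    hits += lower <= true_mean <= upper

coverage = hits / simulations
print(coverage)  # in the long run, close to 0.95
```

Each run of the loop is one "future study"; across many studies, about 95% of the computed intervals contain the true mean, even though any single interval either does or does not.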
Now, it's also clear that a randomly chosen single confidence interval does not contain the true population parameter 95% of the time. And in the assignment, we'll focus a little bit more on this. So 95% of future 95% confidence intervals will contain the true population parameter in the long run. Never forget that this is a frequentist concept. A single confidence interval, after you've collected the data, either contains the true population parameter or it does not. So in this case, you either have it or you don't, but you never know which it is. But in the long run, you can say that 95% of confidence intervals will contain the true population parameter. You can calculate a confidence interval around almost any estimate, but most commonly we calculate confidence intervals around means, mean differences, or effect sizes. And it's highly recommended to report them around effect sizes, which you should always report. So how do you calculate a 95% confidence interval, or a confidence interval of a different size? Again, there's nothing magical about 95%. You can calculate a 90% confidence interval or a 99% confidence interval, if you want to. For the 95% confidence interval around the mean, you calculate it by taking the mean and then adding or subtracting the critical Z-value multiplied by the standard error. Now, it's okay if you don't immediately remember what the standard error is. The standard error is the standard deviation divided by the square root of N, where N is the sample size. So you can see that the larger the sample we collect, the smaller the standard error, and as a consequence, the smaller the confidence interval. If you want to calculate a 95% confidence interval around a mean, then, based on the normal distribution, you would multiply the standard error by 1.96. This is a value that you see quite often when calculating confidence intervals. It's good to know that this is just the critical Z-value for a two-sided test.
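The formula just described, mean ± 1.96 × (SD / √N), takes only a few lines of Python. The scores below are made-up numbers for illustration:

```python
import math
import statistics

def ci_mean(sample, z=1.96):
    """95% CI around the mean: mean +/- z * (SD / sqrt(N))."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error
    return m - z * se, m + z * se

# Made-up IQ scores for illustration.
scores = [96, 104, 101, 99, 107, 93, 100, 102, 98, 105]
lower, upper = ci_mean(scores)
print(round(lower, 2), round(upper, 2))  # 97.87 103.13
```

Passing z=1.645 instead gives a 90% interval, and z=2.576 a 99% interval; the width scales directly with the critical value and shrinks with the square root of the sample size.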
So you'll very often see this 1.96 value when people calculate confidence intervals. As the sample size increases, confidence intervals become narrower. Let's visualize this. In this graph, we can see an increasing sample size on the horizontal axis. It goes to 100, 200, 300, 400 participants in a two-sided t-test. On the vertical axis, we've plotted the effect size, and in this simulation there's no true effect. In other words, the effect size is zero, and we see that the effect size estimate varies around this value of zero. The dotted lines are the 95% confidence interval, and we see that as the sample size increases, the 95% confidence interval becomes narrower. In this simulation, the effect size actually stays very nicely between the 95% confidence limits. But sometimes, 5% of the time, the simulation will give you an effect size estimate that falls just outside these 95% confidence limits. Now, it's very important not to interpret the confidence interval incorrectly, and regrettably, the most intuitive interpretation is incorrect. You might feel that if you have a single 95% confidence interval, then 95% of future estimates will fall within this single confidence interval. But a confidence interval is a statement about future confidence intervals; it's not a statement about future estimates. It says that 95% of future confidence intervals will contain the true population parameter. You can take a look at how often a single 95% confidence interval will capture future estimates. Simulation studies have shown that if you perform many, many, many simulations, then it turns out that, on average, a single 95% confidence interval captures about 83.4% of future parameter estimates in the long run. This is known as the capture percentage. So a single 95% confidence interval will capture future estimates about 83.4% of the time, on average. Regrettably, this varies quite a lot; it depends, and you never really know whether this is the case for any single interval or not.
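The capture percentage can also be approximated by simulation. This is an illustrative sketch, not the lecture's own simulation: assuming a normal population, we draw one initial study, compute its 95% confidence interval, check what fraction of future sample means falls inside that one interval, and average over many initial studies:

```python
import random
import statistics

random.seed(2)

true_mean, sd, n = 0.0, 1.0, 50

def study():
    """One simulated study: sample mean plus its normal-approximation 95% CI."""
    s = [random.gauss(true_mean, sd) for _ in range(n)]
    m = statistics.mean(s)
    se = statistics.stdev(s) / n ** 0.5
    return m, m - 1.96 * se, m + 1.96 * se

captures = []
for _ in range(300):  # many initial studies
    _, lower, upper = study()
    # Fraction of future sample means captured by this single interval.
    future_means = [study()[0] for _ in range(200)]
    captures.append(sum(lower <= f <= upper for f in future_means) / 200)

avg_capture = statistics.mean(captures)
print(avg_capture)  # on average roughly 0.83, clearly below 0.95
```

Individual intervals vary a lot: an interval whose center happens to land far from the true mean captures far fewer future estimates, which is why the capture percentage only holds on average.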
So this is not a very usable statistic in itself, but at least it's one way to interpret a single 95% confidence interval. It gives you some idea of how often this single confidence interval will contain the true population parameter: about five in six times it will be the case, but you never really know for a single observation. Now, confidence intervals and p-values are directly related. If a 95% confidence interval does not contain zero, it automatically implies that the p-value is smaller than 0.05. Let's take a look at a meta-analysis where we've plotted effect sizes and confidence intervals, so we can visualize this. In this graph, we see what's known as a forest plot. Every line illustrates a single study. On the horizontal axis, we see the effect size, Hedges' g. The four studies vary in their effect sizes, illustrated by the yellow squares, around an effect size of 0.43. Each effect size has a 95% confidence interval, visualized by the black horizontal line. If we take a look at the third study from the top, we can see that its 95% confidence interval overlaps with the black vertical line at a Hedges' g of zero. This implies that the third study did not yield a statistically significant effect, because the 95% confidence interval around the effect size estimate contains zero. Now, we've been talking about confidence intervals from a frequentist perspective, and here we have the slightly counter-intuitive interpretation of confidence intervals in the long run. So 95% confidence intervals will contain the true population parameter 95% of the time in the long run, which is a frequentist interpretation. A more intuitive interpretation is found in Bayesian statistics, where people calculate credible intervals, or a slightly different version known as a highest density interval, and these can be interpreted in a more straightforward manner. A 95% credible interval contains all of the values that you find most plausible.
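The link between a 95% confidence interval excluding zero and p < 0.05 can be shown with a simple normal-approximation test. This is a sketch, not from the lecture, and with small samples you would use the t distribution instead of z; the data are made up for illustration:

```python
import math
import statistics

def z_test_and_ci(sample):
    """Normal-approximation two-sided test of mean = 0, plus a 95% CI."""
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    z = m / se
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))  # two-sided p
    return p, m - 1.96 * se, m + 1.96 * se

# Made-up effect estimates for illustration.
sample = [0.8, 1.2, 0.3, 1.5, 0.9, 1.1, 0.4, 1.0]
p, lower, upper = z_test_and_ci(sample)
excludes_zero = lower > 0 or upper < 0
print(p < 0.05, excludes_zero)  # always agree, here both True
```

Because the test and the interval use the same critical value (1.96), the interval excludes zero exactly when the p-value drops below 0.05, which is why a forest plot lets you read off significance at a glance.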
In this case, you are quantifying your belief. It doesn't mean that 95% of the true population parameters indeed fall within this area. It doesn't mean that in the long run, you can expect 95% of credible intervals to contain the true population parameter. But it quantifies your belief: these are the values that you find most plausible. Remember that earlier, we visualized the prior distribution, in this case the gray line, and the blue dashed line illustrating the likelihood function, which together create the posterior distribution, illustrated by the black line. We can calculate the 95% credible interval: the values that lie between the two shaded areas in the tails. Confidence intervals provide some idea of what will happen in the future. You can use confidence intervals to quantify the uncertainty in your estimates. Whenever you report important parameter estimates, such as effect sizes, you should always accompany them with confidence intervals. [MUSIC]