[MUSIC] In this lecture we'll talk about p-curve analysis, a recently developed meta-analytical tool that allows you to judge the evidential value in sets of studies in the published literature. What do p-values look like from 100 studies where there is no true effect? We've discussed this in an earlier lecture. If you remember correctly, p-values are uniformly distributed if there's no true effect. Now if you've watched enough medical series, you know that a flat line is no good news, right? So in this case, if we see a p-value distribution that's uniformly distributed, then nothing is going on. What do p-values look like from 100 studies where there is a true effect? Well, in this case we know that the p-value distribution depends on the statistical power of the test. In general we can expect more low p-values, around 0.01 for example, than high p-values around 0.05. Even though all these p-values are statistically significant, because they're all smaller than 0.05, they're not all equally likely if there is a true effect. We should find low p-values more often than relatively high p-values. P-curve analysis allows you to test whether a set of p-values has evidential value. In this case evidential value is defined as: does the p-value distribution look more like a uniform distribution, where there's no true effect, or more like a p-value distribution where there is a true effect? An important benefit of p-curve analysis is that it's a key to the file drawer. The test is only performed on p-values smaller than 0.05. Given publication bias, we know studies are much more likely to be published when they are statistically significant than when they are not. Because you only need the statistically significant study results to perform the test, you can perform it even though there is publication bias, and that's a hugely important benefit.
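To make the two distributions concrete, here is a minimal simulation sketch. It is an illustration only, not part of the lecture: the effect size (0.5), the sample size (50 per group), the number of simulated studies, and the use of a z-test with a known standard deviation of 1 are all assumptions chosen to keep the example short. Among the significant results, it reports the share of p-values below 0.025. With no true effect that share should be close to one half, and with a true effect it should be clearly higher.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_p_values(effect_size, n_per_group, n_studies, rng):
    """Simulate two-group studies (assumed known SD = 1) and return p-values."""
    p_values = []
    for _ in range(n_studies):
        treatment = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        mean_diff = sum(treatment) / n_per_group - sum(control) / n_per_group
        # Standard error of a difference between two means with SD = 1.
        z = mean_diff / math.sqrt(2.0 / n_per_group)
        p_values.append(two_sided_p(z))
    return p_values


def share_low_among_significant(p_values, alpha=0.05):
    """Among p < alpha, the proportion that falls below alpha / 2."""
    significant = [p for p in p_values if p < alpha]
    low = [p for p in significant if p < alpha / 2]
    return len(low) / len(significant)


rng = random.Random(1)  # fixed seed so the run is repeatable
null_p = simulate_p_values(0.0, 50, 5000, rng)  # no true effect
true_p = simulate_p_values(0.5, 50, 5000, rng)  # assumed true effect

null_share = share_low_among_significant(null_p)
true_share = share_low_among_significant(true_p)
print(f"No true effect: share of significant p below 0.025 = {null_share:.2f}")
print(f"True effect:    share of significant p below 0.025 = {true_share:.2f}")
```

Only the significant p-values enter the comparison, which mirrors why the approach still works when non-significant studies stay in the file drawer.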
Another useful thing about p-curve analysis is that you can perform the test online, so you don't need any statistical software. Go to the online test, enter your test statistics, and it performs all the required calculations for you. The question we're trying to address with a p-curve analysis is: does the p-value distribution look like one with, or one without, a true effect? A good moment to perform a p-curve analysis is before you set out to build on studies in the published literature. You might want to know whether these studies in the published literature are just there due to publication bias, and they are all basically Type 1 errors with no true effect, or whether the results look robust enough to build on. So we're trying to differentiate between these two possible p-value distributions: uniform, or what's known as right-skewed. The green line in this case illustrates a true effect. Now let's take a look at an example. I performed a p-curve analysis of two related lines of research: on the one hand elderly priming studies, and on the other hand professor priming studies. Now the evidence in these studies was debated after the publication of failed replication studies. Some people responded to these failed replications by indicating that there are many, many different studies in the literature that show related effects. Now this is a valid reply, except that if you want to say that there's evidence in the published literature, then you need to quantify this. You can do this either by performing a traditional meta-analysis, or you can perform a p-curve analysis, and that's what I did in this case. Now this is the p-curve analysis for elderly priming studies. Elderly priming examines the idea that if you expose people to elderly-related words, such as bingo, grey, or Florida, this will influence their behavior. More specifically, it will influence their walking speed, the idea being that elderly people walk more slowly.
So, people who were primed with elderly-related words would walk more slowly to an elevator than people in the control condition. If we take a look at the p-value distribution of the studies in the literature, we see that the p-value distribution is not right-skewed. In fact, we see that a lot of these p-values are only barely significant. This is not an indication that the studies published in the literature provide support for the hypothesis. The hypothesis might still be true, but the data that we have available do not give us any reason to believe that this is a true effect. Now let's take a look at professor priming studies. In this line of research, people are asked to think about a professor, or in the control condition, they're asked to think about a group that's stereotypically seen as less intelligent, such as soccer hooligans. The idea in these studies is that participants who thought about a professor will perform better on a general knowledge test, such as Trivial Pursuit, than people in the control condition. If we look at the p-value distribution in this case, the distribution is very similar to the one we would expect when there is a true effect. Now this illustrates how useful it is to perform a p-curve analysis before you build on studies in the published literature. Let's say that you're interested in priming effects: would you build on elderly priming research, or would you build on professor priming research? Now, these p-curve analyses do not prove that professor priming is a real effect and elderly priming is not a real effect. But if you're thinking about doing research on priming, and you have to decide between these two research areas, then it's a much better bet to build on professor priming studies than on elderly priming studies. You can also use p-curve analysis for small sets of p-values.
You have to take a little care: when you have a small number of observations, you have low statistical power for any test, and that also holds for a p-curve analysis. But sometimes the data is so clear that even with a small set of studies, you can prevent yourself from following up on unreliable research. Let's take an example. In this publication, researchers look at the effect of reading literary fiction on theory of mind: how much do you, for example, empathize with other people? The researchers performed five studies, and they provide a table with the critical test results for each study. We're always interested in the main effect of condition, and I've highlighted the test results here. We can see that for the first study they observed a p-value of 0.01. For the second they observed a p-value of 0.08. This test result will be ignored by the p-curve analysis, because it only focuses on p-values smaller than 0.05. Now for the remaining three studies, we see p-values of 0.04, 0.04, and 0.04. Let's perform a p-curve analysis on this data set. We can see that the p-value distribution is not what you would expect if there is a true effect. And indeed the statistical test tells us that the evidential value of this set of studies, if any, is inadequate. This does not mean that the theory is false. It could still be true, but the data we have available in this publication do not allow us to conclude that the hypothesis is supported. The reason that I'm giving this example is not that I want to point out that some people publish sets of studies that are not reliable; it happens. It is very important for you to know that you can use p-curve analysis to identify sets of studies that are less reliable than others. When I was presenting p-curve analysis in a workshop I taught, one of the PhD students in the room performed a p-curve analysis on exactly this set of studies. This PhD student had been trying to replicate these studies for over a year but had been unsuccessful.
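The five reported p-values from the example above can be run through a deliberately simplified check. This sketch is not the test the online tool performs, which works from the exact test statistics rather than rounded p-values. It is a plain binomial check, under the assumption that with no true effect each significant p-value is equally likely to fall below or above 0.025. The function name `binomial_right_skew_test` is made up for this illustration.

```python
from math import comb


def binomial_right_skew_test(p_values, alpha=0.05):
    """Simplified right-skew check on significant p-values.

    Returns the number of significant p-values, how many of those fall
    below alpha / 2, and the one-sided binomial probability of seeing at
    least that many low p-values if each were a 50/50 coin flip.
    """
    significant = [p for p in p_values if p < alpha]
    n = len(significant)
    low = sum(1 for p in significant if p < alpha / 2)
    # P(X >= low) for X ~ Binomial(n, 0.5)
    p_binom = sum(comb(n, k) for k in range(low, n + 1)) / 2 ** n
    return n, low, p_binom


# p-values as reported in the lecture example; 0.08 is dropped by the filter.
reported = [0.01, 0.08, 0.04, 0.04, 0.04]
n_sig, n_low, p_binom = binomial_right_skew_test(reported)
print(f"{n_sig} significant p-values, {n_low} below 0.025, binomial p = {p_binom}")
# prints: 4 significant p-values, 1 below 0.025, binomial p = 0.9375
```

Only one of the four significant p-values is below 0.025, so this check gives no sign of right skew. With just four values the check has very little power either way, which is the caution about small sets mentioned above.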
So I think it's very important to perform a p-curve analysis before you try to build on studies in the literature, because otherwise you might waste a year trying to build on research that is not as reliable as you think. Now remember, when a p-curve analysis says that a set of studies lacks evidential value, the theory might still be true. The data just don't provide evidence for the theory. P-curve analysis allows you to test whether a set of studies in the published literature yields a p-value distribution that looks more like the one you would expect when there is a true effect than the one you would expect when there is no true effect. Using p-curve analysis before you decide to build on studies in the published literature might allow you to see how robust these findings are, and prevent you from building on unreliable results. [MUSIC]