Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.


From the course by Johns Hopkins University

Mathematical Biostatistics Boot Camp 2


From the lesson

Techniques

This module is a bit of a hodgepodge of important techniques. It includes methods for discrete matched-pairs data as well as some classical nonparametric methods.

- Brian Caffo, PhD, Professor of Biostatistics

Bloomberg School of Public Health


Okay, so that's the nonparametric equivalent of the paired test. Let's talk about the nonparametric equivalent of the unpaired test. Here we're comparing two measurement techniques, again from Rice's wonderful book, Mathematical Statistics and Data Analysis. They were comparing two measuring techniques, and the units are in degrees Celsius per gram. We have one group measured with method A and one group measured with method B, and we want to test: are the measurements the same? We'll be a little more formal about the hypothesis in a minute, but let's talk about how we can do that.

The technique we're going to use for testing whether the two methods are the same (not to be confused with method A and method B themselves) is to take the A/B labels and shuffle them across the measurements. To be nonparametric, we're going to shuffle them on the ranks. Later on we'll talk about shuffling them on the observed values themselves; that's the so-called permutation test.

Okay, so what we're going to test is whether or not the two treatments have the same location, by which I mean the distributions are centered at the same place. We're going to assume that the measurements are independent and identically distributed, with errors that are not necessarily normal. Note there's a difference between the errors being normally distributed and the measurements being normally distributed. That's one way to write out the assumptions.

Another way is to view this as a test of a distributional shift: the distribution for method B is uniformly shifted relative to that for method A. That's called a stochastic shift between two arbitrary distributions. So you can either specify the hypothesis tightly, that the distributions are centered at the same location with IID errors, in which case you get the same test statistic and it has a certain power for that particular collection of hypotheses; or you can state the very general one about a stochastic shift, and then the test has a different kind of power for that set of hypotheses.

So all we're going to do is disregard the method A and method B labels, rank the observations, and then use the sum of the ranks within each treatment. This is called the Wilcoxon rank sum test. It's equivalent to the so-called Mann-Whitney test as well, so you might call it the Wilcoxon-Mann-Whitney test. In R, it's wilcox.test.

I should say that there are some slight differences between the tests depending on how you set them up; they work out to be the same, but they characterize the test statistic in slightly different ways. It's still correct, I think, to attribute the test to Wilcoxon and to Mann and Whitney, Mann and Whitney being two researchers.

So the procedure is to discard the treatment labels (method A and method B in this case), rank the observations without concern for which treatment they came from, and calculate the sum of the ranks in the first treatment, where "first" is arbitrary: you can pick either treatment and get an equivalent test, but you have to pick one of the two. Then you either compare your statistic with the asymptotic normal distribution of the statistic, or you can calculate its exact distribution under the null hypothesis.

So here I show the ranks for method A and the ranks for method B. In case two observations are tied, we give them the average rank and then move on. The sum of the ranks for method A was 180, and the sum of the ranks for method B was 51. By the way, the two sums have to add up to 231.
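The tie-handling and rank-sum bookkeeping can be sketched in Python. The measurements below are made up for illustration (they are not the values from Rice's book); the point is the midrank rule for ties and the fact that the two rank sums always total n(n+1)/2:

```python
def midranks(values):
    """Rank the values 1..n, giving tied observations the average
    (mid) rank, as in the tie-handling rule described above."""
    order = sorted(values)
    rank_of = {}
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and order[j] == order[i]:
            j += 1
        # ranks i+1 .. j are tied; assign their average to this value
        rank_of[order[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[v] for v in values]

# Hypothetical measurements (NOT the data from the lecture's example)
method_a = [80.02, 80.04, 80.03, 80.04, 79.98]
method_b = [79.97, 80.02, 79.95]

combined = method_a + method_b
ranks = midranks(combined)
w_a = sum(ranks[: len(method_a)])   # rank sum for method A
w_b = sum(ranks[len(method_a):])    # rank sum for method B

n = len(combined)
assert w_a + w_b == n * (n + 1) / 2  # the sums always total n(n+1)/2
```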

Just because that's a fun result, let's show why it's the case. Gauss supposedly did this as a child; there's some suggestion the story is apocryphal, but for our purposes let's assume he did it as a kid. The story goes that his teacher asked him to add up the numbers between 1 and 100, and he went and sat at his desk and just came back with the answer. The teacher said, that's not possible, how did you do that? And then he went and really did it and got the same answer.

At any rate, I think the story's probably apocryphal, but it's a neat way to show the result. We can write x as the sum of the integers from 1 to n: 1 plus 2 plus 3, and so on up to n. Or we can write the same sum as n, plus n minus 1, plus n minus 2, all the way down to 1. If you add the two together, you get 2x, and notice that 1 plus n is n plus 1, 2 plus n minus 1 is n plus 1, 3 plus n minus 2 is n plus 1, and so on. So 2x is the number n plus 1 added up n times, which is n(n + 1), and therefore x has to be n(n + 1)/2.
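The pairing argument above is easy to check numerically; a quick sketch in Python:

```python
# Pair 1 with n, 2 with n-1, ...: each pair sums to n+1, and adding
# the series to its reversal gives n such terms, so x = n(n+1)/2.
def gauss_sum(n):
    return n * (n + 1) // 2

for n in (10, 100, 1000):
    assert gauss_sum(n) == sum(range(1, n + 1))

# With 21 observations, the ranks 1..21 total 21 * 22 / 2 = 231,
# which is why the two rank sums in the example must add to 231.
assert gauss_sum(21) == 231
```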

Okay, so let W be the sum of the ranks for the first treatment. If one treatment has more observations in it, then under the null hypothesis it's going to have a higher rank sum just by virtue of having more numbers, so we need to know nA and nB, the number of observations in each sample. It turns out that under the null hypothesis, the expected value of the sum of the ranks for the first group works out to be nA(nA + nB + 1)/2, with a standard error of sqrt(nA nB (nA + nB + 1) / 12). Then we can create a test statistic: W, our sum of the ranks in the first group, minus its expected value, divided by its standard error. That turns out to be asymptotically standard normal, and of course you can calculate the exact distribution as we described before.

Okay, so let's go through our example. Taking method B as our first group, the sum of the ranks was 51. The expected value and standard deviation of that statistic are 88 and about 13.8, so our test statistic works out to be negative 2.68, with a two-sided p-value of 0.007.
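The normal approximation just described needs nothing but the two group sizes and the observed rank sum. A sketch in Python, using the standard formulas E[W] = n1(n1 + n2 + 1)/2 and SE = sqrt(n1 n2 (n1 + n2 + 1)/12):

```python
import math

n1, n2 = 8, 13   # sizes: the group whose ranks we summed, and the other
W = 51.0         # observed rank sum for that group (method B here)

expected = n1 * (n1 + n2 + 1) / 2             # 88
se = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # about 13.8
z = (W - expected) / se                        # about -2.68

# Two-sided p-value from the standard normal CDF
def phi(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

p = 2 * phi(-abs(z))
print(round(z, 2), round(p, 3))   # -2.68 0.007
```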

You can also use the function wilcox.test, and it will perform the test for you, both the one-sample and two-sample versions; you'll want to read over the documentation for wilcox.test. If you give it one vector, it does the signed rank test; if you give it two vectors, it does the rank sum test.

Some final notes about nonparametric tests. They tend to be more robust to outliers than their parametric counterparts. They do not require normality assumptions. They often have exact small-sample versions. And their big trick is to focus on the ranks rather than the raw data. There is some loss in power relative to their parametric counterparts, assuming the parametric assumptions are met, but the loss in power is often not so bad.

I also want to emphasize that nonparametric tests are not assumption free; they are often distribution free. For example, for the signed rank test you really have to assume that the distribution is symmetric. And either way, in all of the tests that we considered, you have to have a sampling model, that the data are IID. That's an assumption, a big assumption, the biggest assumption. So, just to emphasize: nonparametric tests are often distribution free, but not assumption free.

Then I just want to remind people about permutation tests, because we've already talked about them a little bit with regard to Fisher's exact test, but here we can also talk about them in general. Permutation tests are similar to these rank sum tests, though they use the data rather than the ranks. Under the null hypothesis for the rank sum test, we had the collection of ranks, and our null distribution was obtained by permuting the treatment labels: we had nA treatment A labels and nB treatment B labels, and we permuted those with respect to the ranks. That retains nA labels and nB labels, but they get randomly allocated among the ranks. A permutation test is exactly the same thing; you're just doing it to the raw data rather than the ranks, and you have to come up with a statistic.

I go through the procedure here. You could, for example, permute the ranks and then create a rank statistic. I also want to distinguish two ways to think about this. One is: imagine your treatment was actually randomized. Then you can think of the permutation test as redoing the randomization, and under that interpretation the permutation test is called a randomization test. But you can also perform the permutation test even if the treatment wasn't randomized, because you're thinking along the lines of: my null hypothesis is that the A and B labels are exchangeable between the groups. Either way it makes sense, but the interpretation of the test changes a little depending on which view you take.

At any rate, Fisher's exact test, which works on collections of binary data, the rank sum test, which works on the ranked observations, and permutation tests all share the same basic principle: under the null hypothesis, however we're interpreting it, the nA and nB treatment labels are exchangeable, and our null distribution is obtained by permuting those labels across the values. In Fisher's exact test and the permutation test, we permute them across the actual observed values; in the rank sum test, we convert the data to ranks first and then permute across the ranks.

So just to reiterate, this is an easy way to produce a null distribution for a test of equal distributions. It has a similar flavor to the bootstrap, though not exactly. It produces an exact test. It's less robust but more powerful than rank sum tests, because you're not throwing away the data: with rank sum tests you throw away the actual units and work with ranks, so you gain robustness at the expense of power. Here you get a little more power under certain assumptions, but you lose some of that robustness. It's very popular in large-scale, big-data applications, like genomics and neuroimaging.

This final picture is what you would expect to get from a permutation test. You permute the method A and method B labels, calculate the t statistic as if the permuted labels were the observed labels, and do that over and over again, so you get a null distribution of t statistics. The vertical line is where our actual t statistic occurred. It doesn't have to be a t statistic, by the way, but that's a reasonable statistic to use. The percentage of the simulated statistics that are more extreme than our observed statistic is our exact p-value. So that's a permutation test. If you were to do it with the ranks, this would just be simulating the exact small-sample distribution of the rank sum statistic; if you do it with the raw data, it's the so-called permutation test, and so on.
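The label-shuffling loop just described can be sketched as an exact permutation test on a toy data set. The values and the difference-in-means statistic below are illustrative choices, not the lecture's data; because the groups are tiny, every relabeling can be enumerated, giving the exact null distribution:

```python
from itertools import combinations

# Toy data; any statistic would do -- here, difference in group means
group_a = [1.0, 2.0, 3.0]
group_b = [10.0, 11.0, 12.0]

pooled = group_a + group_b
n_a = len(group_a)

def stat(a, b):
    return sum(a) / len(a) - sum(b) / len(b)

observed = stat(group_a, group_b)

# Enumerate every way to relabel n_a of the pooled values as "A";
# under the null hypothesis the labels are exchangeable, so each
# relabeling is equally likely.
null = []
for idx in combinations(range(len(pooled)), n_a):
    a = [pooled[i] for i in idx]
    b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    null.append(stat(a, b))

# Exact two-sided p-value: fraction of relabelings at least as
# extreme as what we observed
p = sum(abs(t) >= abs(observed) for t in null) / len(null)
print(p)   # 0.1: only the observed split and its mirror are this extreme
```

Swapping the raw values for their ranks in `pooled` would instead simulate the exact small-sample distribution of the rank sum statistic, as noted above.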

Â Coursera provides universal access to the worldâ€™s best education,
partnering with top universities and organizations to offer courses online.