So in this section I want to talk about Rank Sums.

Now, you'll notice in the literature or

in textbooks that there is more than one way to go about doing rank sums,

and they end up with different tables if you look up the p-value in tables,

but the mathematics behind the statistics is exactly the same.

One method just goes further, just one step further than the other,

when it calculates those tables.

The results are comparable though.

So what do we do?

Now imagine we have two groups of patients, group A and group B, and

imagine these are just scores for, for instance, depression.

And we note patients in group A scored a 7, a 4, a 9, and an 18.

And patients in group B scored an 11, a 5, a 20, and a 14.

Now with rank sums we have to rank them, and we put them in order there, and

you see the second little table there, going from the 4, the smallest value, up to

the 20, the largest value, and we just put them in whichever group they belong to.

Now the first method I'm going to discuss is just a who beats who kind of method,

it's matches.

They've been matched up with each other.

Think about tennis matches.

Who's gonna beat who?

But let's have a look.

So I can write a little recipe there from the smallest value to the largest value,

pertaining to what group the patients belong to.

So the 4 came from group A and the 5 came from group B, and the 7 and

the 9 both came from group A.

So it's going to go A B A A B B A B.
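
To make that recipe concrete, here is a minimal Python sketch that sorts the eight scores and reads off the group labels. The variable names (group_a, group_b, labelled, pattern) are illustrative assumptions, not anything from the slides.

```python
# Scores from the example; variable names are illustrative only.
group_a = [7, 4, 9, 18]
group_b = [11, 5, 20, 14]

# Tag every score with its group, then sort by score (smallest first).
labelled = sorted([(x, "A") for x in group_a] + [(x, "B") for x in group_b])

# Read off the group labels in rank order.
pattern = " ".join(label for _, label in labelled)
print(pattern)  # A B A A B B A B
```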

Now you can well imagine, if you take two different sample sets,

those As and Bs will come out in a different order.

And since we've got eight values there, four in each group,

there are actually 70 ways that you could arrange those As and Bs.

From AAAA BBBB,

all the way to BBBB AAAA.

And you can well imagine that any group analysis that you might do,

any sample that you might have might fall somewhere, somewhere in that range.
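
The figure of 70 is just the number of ways to choose which 4 of the 8 rank positions belong to group A. As a quick check, this small sketch lists them all; the names n_a, n_b and arrangements are assumptions made for illustration.

```python
from itertools import combinations
from math import comb

n_a, n_b = 4, 4          # four patients per group, as in the example
n_total = n_a + n_b

# Choose which rank positions hold an A; the remaining positions hold a B.
arrangements = [
    "".join("A" if i in a_positions else "B" for i in range(n_total))
    for a_positions in combinations(range(n_total), n_a)
]

print(len(arrangements), comb(n_total, n_a))  # 70 70
print(arrangements[0], arrangements[-1])      # AAAABBBB BBBBAAAA
```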

Now look what we do here.

Here's where we get to the matches.

I've still got my same table there the 4 coming from the A,

the 5 coming from the B, and now we just look at the 4.

And we look towards the left and this is the first little tennis match.

Does the 4 beat anyone in group B?

No it doesn't.

It is the lowest number.

So it gets a point of 0 there.

We move to the next number in the ranking.

That'd be the 5, which is in group B.

And we see how many it beats.

Now by how many it beats, I mean how many in the other group are less than it.

So there's only one value less than it in the A group, and that is the 4, so

it's just gonna beat one value and it gets a point of 1 there.

Now we go to the 7,

how many in the other group does it beat?

Well, it beats one, the 5, so it gets 1.

The 9 also only beats one, which is the 5, so it gets 1.

Now we go to the 11 in B and we see that it beats one, two, three values.

Those three values that are less than it are the 4, the 7 and the 9, so it gets a 3.

The same goes for the 14, and then you can work out the rest to the end there: the 18 gets a 3 and the 20 gets a 4.

And we sum up all of those results, so all these little matches, and we see for

group A we get a sum of 5 and for group B we get a sum of 11.

Now there are 4 in each group.

4 times 4 that is 16.

So we actually had 16 little matches.
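
The match counting above can be sketched in a few lines of Python. The helper name wins is hypothetical, and the sketch assumes there are no tied scores, which holds for this example.

```python
group_a = [7, 4, 9, 18]
group_b = [11, 5, 20, 14]

def wins(own, other):
    """Count the pairings in which a value from `own` is larger than a value from `other`."""
    return sum(1 for x in own for y in other if x > y)

score_a = wins(group_a, group_b)            # points won by group A
score_b = wins(group_b, group_a)            # points won by group B
n_matches = len(group_a) * len(group_b)     # every A plays every B once

print(score_a, score_b, n_matches)  # 5 11 16
```

With no ties, the two scores always add up to the number of matches, here 5 + 11 = 16.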

And you can imagine that we can get a variety of point scores.

Now, we're usually going to go to the first group or the smaller group,

we see the 5 there, and we now look at all those arrangements of As and Bs,

like A B A A B B A B, and ask how many of them can make up a 5?

Now, certain values are going to have more ways to make them up, and

that's our little distribution.

Some of these values are going to have very few ways of making them up.

So they are going to occur less commonly,

and therefore they might fall in that area of statistical significance.
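
Here is a rough sketch of that little distribution: it goes through all 70 arrangements, works out group A's match score for each, and counts how often each score turns up. The function name score_for_a is hypothetical, and doubling the lower tail for a two-tailed p-value is one common convention, assumed here rather than taken from the lecture.

```python
from collections import Counter
from itertools import combinations

n_a, n_b = 4, 4
n_total = n_a + n_b

def score_for_a(a_positions):
    # a_positions: sorted 0-based rank positions held by group A.
    # The j-th A (0-based) at position pos has pos values below it,
    # j of which are As, so it beats pos - j Bs.
    return sum(pos - j for j, pos in enumerate(a_positions))

# How many of the 70 arrangements give each possible score (0 to 16)?
counts = Counter(score_for_a(c) for c in combinations(range(n_total), n_a))
total = sum(counts.values())

observed = 5  # group A's score in the example
p_lower = sum(n for s, n in counts.items() if s <= observed) / total
p_two_tailed = min(1.0, 2 * p_lower)  # assumed convention: double the smaller tail

print(counts[0], counts[5], counts[8])  # 1 5 8
print(round(p_lower, 3), round(p_two_tailed, 3))  # 0.243 0.486
```

Extreme scores such as 0 can be made in only one way, while middle scores such as 8 can be made in many ways, so a score of 5 is nowhere near the rare tail.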

Now it's very easy, we can either let the computer calculate a p-value for us or

we can just use a table.

And there will be specific tables pertaining to this method of doing it.

But it's, I think, quite interesting to understand that some little recipes of As and Bs

are going to occur less commonly than others, or

at least the score values that you can get to from those recipes will.

Now the second method is a bit different.

There we just rank them.

We see the 4, 5, 7, and 9, and so on.

And we just give each of them a point for

where it stands in the ranking from the lowest value to the highest value.

So 1, 2, 3, 4, 5, and so on; we add them up per group, and we get 15 for group A, ranks 1, 3, 4 and 7, and

we see the 21 for group B, ranks 2, 5, 6 and 8.
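
The ranking can be sketched like this in Python. The names rank, rank_sum_a and rank_sum_b are illustrative, and the sketch assumes there are no tied scores. The last lines show the standard identity that links the two methods: subtracting n(n+1)/2 from a group's rank sum gives back its match score, which is the one extra step mentioned at the start.

```python
group_a = [7, 4, 9, 18]
group_b = [11, 5, 20, 14]

# Rank 1 goes to the smallest score; this simple lookup assumes no ties.
ordered = sorted(group_a + group_b)
rank = {value: i + 1 for i, value in enumerate(ordered)}

rank_sum_a = sum(rank[x] for x in group_a)  # ranks 1, 3, 4, 7
rank_sum_b = sum(rank[x] for x in group_b)  # ranks 2, 5, 6, 8
print(rank_sum_a, rank_sum_b)  # 15 21

# Link to the match scores: rank sum minus n(n+1)/2.
n_a, n_b = len(group_a), len(group_b)
match_a = rank_sum_a - n_a * (n_a + 1) // 2
match_b = rank_sum_b - n_b * (n_b + 1) // 2
print(match_a, match_b)  # 5 11
```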

Of those values we usually just use the 15, the first group again, and

we look that up in a table according to how many patients are in each group.

You'll have an axis across, horizontal,

and an axis down, vertical, with the number of patients in each group.

And you just go to where those two meet, and

it'll give you a lower and an upper value.

And if your value then falls outside of that range for a two-tailed test, or

is more than or less than the relevant value for a one-tailed test, it'll be statistically significant.

So that's a different table that you'll find for this method.

The problem is that those tables can become quite big.

They usually only go up to ten on each side.

And what happens from there on, with more than ten,

is that the rank sum starts to follow, approximately, a z distribution.

And there is a little calculation to take this rank sum value to a z value,

which you can then use to get the p-value.
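
The lecture does not spell out that calculation, so the sketch below assumes the usual textbook form: subtract the rank sum's expected value n_a(n_a + n_b + 1)/2 and divide by the square root of n_a * n_b * (n_a + n_b + 1)/12, with no continuity or tie correction. It reuses the example's rank sum of 15 purely to show the arithmetic; with only four per group you would use the table, not this approximation.

```python
from math import erf, sqrt

rank_sum_a = 15      # rank sum of group A from the example
n_a, n_b = 4, 4      # too small for the approximation; arithmetic only

# Expected rank sum and its variance if the groups do not differ (assumed textbook formulas).
mean_w = n_a * (n_a + n_b + 1) / 2
var_w = n_a * n_b * (n_a + n_b + 1) / 12

z = (rank_sum_a - mean_w) / sqrt(var_w)

def normal_cdf(x):
    # Standard normal cumulative distribution via the error function.
    return 0.5 * (1 + erf(x / sqrt(2)))

p_two_tailed = 2 * normal_cdf(-abs(z))

print(mean_w, var_w)   # 18.0 12.0
print(round(z, 3))     # -0.866
```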

So that is rank sums: two ways to go about it, two sets of tables.

Two ways that the computer can do this. You're gonna get the same result, but

it's all about which differences between the groups would occur more

commonly than others, exactly what we have discussed before.