So this is from the 2013 Science paper, and what I

did here is pull out one of the villages, village 26 in particular.

And here the nodes are color-coded by caste.

And in particular, the blue nodes are what are known as scheduled castes and

scheduled tribes, so these are castes that

are under affirmative action by the Indian government.

The reds are general castes and other backward castes.

So these are not affirmative action castes.

So this is a difference in caste, and then we could look here,

and we could say okay, let's do the very simple block model.

We're just going to see what's the probability of linking with somebody of

your same caste compared to linking with somebody of a different caste.

So here we can go through and just directly count how

many links there are among members of the same caste and

going across castes, where caste here is

just this dichotomous version, not the particular caste.

There are quite a number of castes in these villages.

So what do we end up with? If you

do the count here, the probability of having a link

between a red and a blue node in this graph is 0.006. The probability

of having a link between either two reds or two blues is 0.089.

So a link is more than ten times more likely within the same

caste designation, in this case, as opposed to going across this boundary.
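
As a rough sketch of the count just described: take every pair of nodes, check whether the two nodes share a group, and divide the number of links by the number of pairs in each case. The function name `block_link_rates` and the toy data are made up for illustration; this is not the village 26 network.

```python
from itertools import combinations

def block_link_rates(groups, edges):
    """Return (within_rate, across_rate): the fraction of same-group pairs
    and of different-group pairs that are linked."""
    linked = {frozenset(e) for e in edges}
    within_pairs = within_links = across_pairs = across_links = 0
    for u, v in combinations(groups, 2):
        is_linked = frozenset((u, v)) in linked
        if groups[u] == groups[v]:
            within_pairs += 1
            within_links += is_linked
        else:
            across_pairs += 1
            across_links += is_linked
    # Assumes both kinds of pair exist; otherwise this divides by zero.
    return within_links / within_pairs, across_links / across_pairs

# Toy data invented for illustration (NOT the village 26 network).
toy_groups = {"a": "blue", "b": "blue", "c": "blue",
              "d": "red", "e": "red", "f": "red"}
toy_edges = [("a", "b"), ("b", "c"), ("d", "e"),
             ("e", "f"), ("d", "f"), ("c", "d")]

within_rate, across_rate = block_link_rates(toy_groups, toy_edges)
print(within_rate, across_rate)
```

With the numbers from the lecture, the same ratio is 0.089 / 0.006, which is roughly 14.8, so "more than ten times" is a conservative way of putting it.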

And so we see that indeed there is

substantial homophily based on this kind of simple block model.

And that block model then allows us to keep track of these things.

And if we hadn't colored these nodes, it's

not so obvious that we would see that there

is as strong a dichotomy here as we do once we

start keeping track of this and then estimating directly.

Now, we could have not had the colors in

the picture here, and we could have tried to discover them.

That's known as community detection or clustering algorithms.

So there are ways in which we could begin to look for things.

And we might begin to say, okay, look actually this,

even the blues look like they're segmented into different groups.

We could try and fit,

and see whether there are additional blocks, and whether there is more density in here

than over here, and there are algorithms and techniques people use for doing that.
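
To give a flavor of what such an algorithm does, here is a minimal sketch that scores a candidate grouping by modularity (the share of links inside groups, minus what you would expect given the degrees) and brute-forces the best two-way split. This is one swapped-in technique chosen for illustration, not the method used in the paper, and it only works on tiny graphs. The function names and the toy graph are assumptions.

```python
from itertools import product

def modularity(edges, label):
    """Newman modularity Q of the partition given by label[node]."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Observed fraction of links that fall inside a group.
    inside = sum(1 for u, v in edges if label[u] == label[v]) / m
    # Expected fraction if links were rewired at random keeping degrees.
    group_degree = {}
    for node, d in degree.items():
        group_degree[label[node]] = group_degree.get(label[node], 0) + d
    expected = sum((d / (2 * m)) ** 2 for d in group_degree.values())
    return inside - expected

def best_two_way_split(nodes, edges):
    """Try every two-group labelling and keep the highest-modularity one."""
    best_q, best_label = float("-inf"), None
    first, rest = nodes[0], nodes[1:]
    # Fixing the first node's label avoids counting mirror-image splits twice.
    for bits in product((0, 1), repeat=len(rest)):
        label = {first: 0, **dict(zip(rest, bits))}
        q = modularity(edges, label)
        if q > best_q:
            best_q, best_label = q, label
    return best_label, best_q

# Toy graph invented for illustration: two triangles joined by one link.
toy_nodes = ["a", "b", "c", "d", "e", "f"]
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]

found_label, found_q = best_two_way_split(toy_nodes, toy_edges)
print(found_label, found_q)
```

The search is exponential in the number of nodes, so on a real village network you would use a heuristic instead; the point here is only that the groups are recovered from the links alone, without the colors.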

So what's the important aspect that's missed from block models?

Well, the likelihood does depend on node attributes, either observed or latent.

But in practice, often the probability that two

nodes are interacting, so that two people are interacting,

could depend on whether they have friends in common.

So there is real social structure to it.

So, let's say, the fact that A and

B both have a common friend C makes it more

likely that they are going to be linked to each

other than if they didn't have a friend in common.
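
One simple way to look for this in data is to split all pairs of nodes by whether they share at least one friend, and compare how often each kind of pair is linked. The sketch below does that on a made-up toy graph; the function name `link_rate_by_common_friend` is hypothetical, and this is a descriptive check only, not one of the richer models, and it does not separate the effect of common friends from the effect of shared attributes.

```python
from itertools import combinations

def link_rate_by_common_friend(nodes, edges):
    """Return {True: rate, False: rate}: the share of linked pairs among
    pairs with, and without, at least one common neighbor."""
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    counts = {True: [0, 0], False: [0, 0]}  # [linked pairs, all pairs]
    for u, v in combinations(nodes, 2):
        has_common = bool(neighbors[u] & neighbors[v])
        counts[has_common][1] += 1
        counts[has_common][0] += v in neighbors[u]
    return {key: (linked / total if total else None)
            for key, (linked, total) in counts.items()}

# Toy graph invented for illustration: two triangles joined by one link.
toy_nodes = ["a", "b", "c", "d", "e", "f"]
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]

rates = link_rate_by_common_friend(toy_nodes, toy_edges)
print(rates)
```

In this toy graph, pairs with a common friend are linked much more often than pairs without one, which is the kind of pattern a block model with independent links would not produce on its own.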

And so that could be an aspect where, independently of their

characteristics, people tend to meet other people through friends, or

have reasons for spending time together in

triples and so forth, and that's going

to lead to extra dependencies between links, so we're going to need richer models to capture that.

So next up we'll be talking about classes of models

that allow us to keep track of these explicit dependencies.

Which is going to make our life a little

more difficult statistically, because things aren't independent any longer,

but will allow us to capture a lot

of things in networks that are going to be important.