Okay, so let's take a look at marrying the strategic formation models we have been looking at with some of the earlier models that we had for estimating random networks. In particular, we'll look at the subgraph generation model (SUGM) and try to figure out how we might fit in some of the utility-based calculations we've been looking at. So we've got utility from forming subgraphs (links, triangles, etcetera), but what we're going to do is noise that up by putting some randomness into the utility. Let's have a look at how we might do that, and let's do this in the context of a specific example.

So what we are going to do is try to ascertain whether, when we look at caste relationships, there is some sort of social pressure operating on them; not surprisingly, we might find that there is. In particular, when we look at relationships that go across caste boundaries, are they more likely to occur in private, when people have no friends in common? Or do they occur with the same frequency when people have a friend in common as when they don't? Okay, so that's one of the questions we might ask.

So let's look back at some of the data we had from this work with Abhijit Banerjee, Arun Chandrasekhar and Esther Duflo. This is village 26 again, kerosene and rice sharing. Again, we've colored the nodes by a dichotomous caste split: scheduled castes and scheduled tribes are blue; general and otherwise backward castes are red. What we see is that there are fewer relationships going across the boundary of this designation than within. The probability of a link going across was 0.006, and the probability within was 0.009.

But we can look at a different one. Here's village 48, with a different network: which household visits which other household socially. It's a denser network, but we see similar patterns in terms of segregation. So what we want to ask here is this. Take somebody from the red and somebody from the blue categorization. Do we see this kind of relationship, where they have a friend in common, relatively less frequently than we see this kind of relationship, where they don't? In other words, do people prefer to form these ties in private rather than in situations where there's going to be some sort of witness to the interaction?

Okay, so one difficulty in beginning to estimate this kind of thing is that triangles take three people to agree to form, whereas links take only two. So naturally it's going to be more difficult to get triangles to form. That gives us a bias that makes triangles relatively less likely. And if we make any particular link less likely across castes, then cross-caste triangles might have a lower likelihood just because we're working with threes rather than twos. So what we want to do is account for preferences explicitly. Otherwise, we're naturally going to find that the ratio of less desired triads to more desired triads looks smaller than the ratio of less desired dyads to more desired dyads. There's going to be a bias there unless we account for this carefully.
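
A quick numeric illustration of this bias, as a minimal sketch. The two agreement probabilities are made-up values, not estimates from the village data, and the sketch assumes each person agrees independently, which is the simplification the lecture adopts later on.

```python
# Hypothetical per-person probabilities of agreeing to a relationship.
# These numbers are illustrative assumptions, not values from the data.
p_within = 0.9  # chance one person agrees to a within-caste relationship
p_cross = 0.6   # chance one person agrees to a cross-caste relationship


def formation_ratio(p_cross: float, p_within: float, n_people: int) -> float:
    """Cross/within frequency ratio when all n_people must independently agree."""
    return (p_cross / p_within) ** n_people


link_ratio = formation_ratio(p_cross, p_within, 2)      # two people per link
triangle_ratio = formation_ratio(p_cross, p_within, 3)  # three per triangle

# The same underlying preference ratio (0.6 / 0.9) shows up as a smaller
# cross/within ratio for triangles than for links.
print(f"links: {link_ratio:.3f}, triangles: {triangle_ratio:.3f}")
```

Nothing about social pressure is built in here: triangles look rarer across castes purely because three agreements are needed instead of two.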

So how are we going to do this? Let's build preferences into a subgraph generation model, and then figure out how the probability of a link forming depends on the likelihood that the pair meets and on both of them wishing to form it. Okay?

Generally, we can think of this as saying there are characteristics that i has, in this case their caste designation. There's going to be a utility that i gets from forming a link, based on their own characteristics and the other person's characteristics. Then there is something else, either unobserved or some personality trait, which also affects that utility. So we'll put in an error term, which we subtract off. It could be negative or positive, so maybe it's a boost, but there is some random element here.

Â 4:16

And i benefits from the link.

Â Yeah, if and only if, this error is less than this utility.

Â Okay, so if the error's

Â less in magnitude than the utility, then this term's going to be positive.

Â And you're going to want to form that link, and otherwise you're not going to.
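
The decision rule just described can be sketched in a few lines. This is only an illustration: the function names, the two utility values, and the normal noise distribution are all assumptions, since the lecture says only that utility depends on both people's characteristics and that an error term is subtracted.

```python
import random


def link_utility(caste_i: str, caste_j: str) -> float:
    """Deterministic part of i's utility from a link with j.

    The two values are hypothetical placeholders chosen for illustration.
    """
    return 1.0 if caste_i == caste_j else 0.2


def benefits_from_link(caste_i: str, caste_j: str, error: float) -> bool:
    """i benefits iff utility minus the error term is positive."""
    return link_utility(caste_i, caste_j) - error > 0


rng = random.Random(0)
error_ij = rng.gauss(0.0, 1.0)  # assumed noise distribution, not from the lecture
print(benefits_from_link("blue", "red", error_ij))
```

A negative error acts as a boost, so even a low-utility cross-caste link can be worthwhile for a particular pair.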

Okay, so we have a very simple preference-based model. Now we're going to try to fit that into a subgraph generation model. So how are we going to do that?

Well, under pairwise stability, a link forms if and only if both of the two people prefer it, assuming that the chance of getting exactly zero utility is zero. So links form if and only if i prefers to form the link and j prefers to form the link: the error that i gets from the link with j is less than the utility that i gets from it, and the error that j gets from the link with i is less than the utility that j gets from it.

So a link forms when both of these things are true. If we have some distribution for the error terms, then the probability that a given link forms is going to be proportional to the probability that i's error is less than i's utility, times the probability that j's error is less than j's utility. It has to be that both of them prefer it, and the chance that both prefer it is the product of the two, as long as the noise in whether j likes i is independent of the noise that i gets from the same relationship. Okay.
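
As a rough check on the product rule, here is a sketch that compares the closed-form probability with a simulation. It assumes standard normal errors (the lecture leaves the distribution unspecified) and ignores the meeting probability, so it covers only the "both prefer it" part; the function names are illustrative.

```python
import math
import random


def normal_cdf(x: float) -> float:
    """CDF of a standard normal, the assumed error distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def link_probability(u_i: float, u_j: float) -> float:
    """P(both prefer the link) = P(e_i < u_i) * P(e_j < u_j), independent errors."""
    return normal_cdf(u_i) * normal_cdf(u_j)


def simulate_link_frequency(u_i: float, u_j: float,
                            draws: int = 100_000, seed: int = 0) -> float:
    """Share of independent error draws in which both people prefer the link."""
    rng = random.Random(seed)
    formed = 0
    for _ in range(draws):
        e_i = rng.gauss(0.0, 1.0)
        e_j = rng.gauss(0.0, 1.0)
        if e_i < u_i and e_j < u_j:  # pairwise stability: both must agree
            formed += 1
    return formed / draws


analytic = link_probability(0.2, 0.2)
simulated = simulate_link_frequency(0.2, 0.2)
print(f"analytic: {analytic:.4f}, simulated: {simulated:.4f}")
```

If the two errors were correlated, the simulated frequency would no longer match the simple product, which is why the independence assumption matters.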

Now you can do the same thing with triangles. What happens is that now triangles depend on the three people's characteristics, and we multiply three times: once each for i, j and k. It's exactly the same ideas and principles, so we could generate any kind of subgraph using the same technique: put in utilities for the different subgraph forms, depending on the characteristics of the individuals involved, and then the probabilities that people actually have errors less than those utilities. That gives us a distribution over subgraphs. Okay.
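
The same idea extends to any subgraph by multiplying one term per member. The sketch below is one way to write that down; `subgraph_probability`, the `meet_prob` parameter, and the logistic error distribution are assumptions made for illustration, not the lecture's specification.

```python
import math
from typing import Callable, Sequence


def logistic_cdf(x: float) -> float:
    """CDF of a standard logistic error; any other CDF could be swapped in."""
    return 1.0 / (1.0 + math.exp(-x))


def subgraph_probability(
    utilities: Sequence[float],
    error_cdf: Callable[[float], float],
    meet_prob: float = 1.0,
) -> float:
    """Chance a subgraph forms: the group meets with probability meet_prob,
    and every member must independently have an error below their utility."""
    prob = meet_prob
    for u in utilities:
        prob *= error_cdf(u)
    return prob


# Same per-person utility, two members versus three.
link = subgraph_probability([0.5, 0.5], logistic_cdf)
triangle = subgraph_probability([0.5, 0.5, 0.5], logistic_cdf)
print(f"link: {link:.4f}, triangle: {triangle:.4f}")
```

Passing a list of utilities keeps the function agnostic about the subgraph type: a link passes two entries, a triangle three.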

So now let's go ahead and look at how we would use this kind of model to estimate something. What's the null hypothesis? If we think that there's no social pressure, then a given person's preference for being involved in a cross-caste triad compared to a within-caste triad is the same as their preference for a cross-caste link compared to a within-caste link. So we're allowing them to care about caste, but we're saying that their relative preference for something across caste in a triangle is the same as their relative preference in a link. Okay, so that's the null hypothesis that we have.

So now we can go in and say: our model says that the frequency of cross-caste triads compared to within-caste triads is going to look like this ratio, driven by the utilities, if we assume that everybody has a similar utility function that varies only with whether I'm going across or within, and then we just get idiosyncratic noise on the particular relationships. Then we've got a cube on the cross-caste triads compared to within, and a square on the cross-caste links compared to within. So now we are correcting for the fact that triads are harder to form.

Â 8:09

So what's the probability that I prefer this?

Â Well, this is going to be the cubic.

Â Right, we'll just take a cube root of the of the relative frequencies.

Â Right, so we can then just correct the

Â probability that preferred to form across is just going to

Â be the frequency to the 1 3rd, crossed for links is going to be to the 1 half.

Â Okay.
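
Here is a small sketch of that correction. The link frequencies are the village 26 numbers quoted earlier in the lecture (0.006 across, 0.009 within); the triangle frequencies are hypothetical placeholders, because the transcript does not report them. The function name is illustrative.

```python
def preference_ratio(cross_freq: float, within_freq: float, n_people: int) -> float:
    """Back out the per-person cross/within preference ratio from the relative
    frequency of a subgraph that needs n_people to agree."""
    return (cross_freq / within_freq) ** (1.0 / n_people)


# Links: frequencies quoted in the lecture for village 26.
link_pref = preference_ratio(0.006, 0.009, n_people=2)  # square root, about 0.816

# Triangles: hypothetical frequencies, for illustration only.
triangle_pref = preference_ratio(0.0002, 0.0005, n_people=3)  # cube root

print(f"link-based ratio: {link_pref:.3f}")
print(f"triangle-based ratio: {triangle_pref:.3f}")
```

Under the no-social-pressure null the two printed ratios should agree up to noise; a smaller triangle-based ratio would point toward discouragement.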

So now, under the null hypothesis, these two things should be the same. That tells us that these corrected frequencies in the data should be the same if the null hypothesis is correct. Okay? So what we can do is take the frequency of cross-caste triads compared to within, raised to the one-third power, and compare that to the same ratio for links raised to the one-half power. These should be the same under the hypothesis that social pressure doesn't matter. And if they're different, then we can figure out whether social pressure encourages or discourages it. If the triad number is less, then we're seeing discouragement based on social pressure. And if it's more, then we would see encouragement.

Okay? So let's plot these out. Here's links down here, this ratio, and the points should be on the 45-degree line under the null hypothesis. So this axis is the ratio for triangles, and this is the link ratio raised to the three-halves power to correct for the three versus two. Now when we look at these points, one for each of the 75 different villages, they should all line up on the 45-degree line, or fall half above and half below. And indeed, we see that there are more winding up below. One conservative statistical test here is this: if the null hypothesis were true, then it ought to be a coin flip whether a village ends up on one side or the other of this line. In fact, when you do that, the preponderance of villages end up below the line, and this is statistically significant at the 99.99% level or more.
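
The coin-flip test described here is a sign test, and a minimal version might look like the following. The 75 villages come from the lecture; the count of villages below the line is a hypothetical placeholder, since the transcript does not state it, and the function name is illustrative.

```python
from math import comb


def sign_test_p_value(n_below: int, n_total: int) -> float:
    """One-sided p-value: chance of at least n_below villages landing below
    the line if each village were an independent fair coin flip."""
    tail = sum(comb(n_total, k) for k in range(n_below, n_total + 1))
    return tail / 2 ** n_total


# n_total is from the lecture; n_below is a made-up value for illustration.
p_value = sign_test_p_value(n_below=60, n_total=75)
print(f"one-sided p-value: {p_value:.2e}")
```

The test is conservative in the sense that it uses only which side of the line each village falls on, not how far from the line it is.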

Â 10:15

One interesting thing you can do here is

Â you can actually also then sub divide these villages by how integrated

Â they are in terms of, or how balanced they are in terms of the caste designations.

Â So, some of these villages would be 50% red, 50% blue, in terms of

Â those different measures we had of the

Â scheduled caste, scheduled tribe, versus general and otherwise

Â backward castes.

Â So some of them split halfway down the middle,

Â so you have two di-, you know, people evenly matched.

Â Others are say 90% to 10% or 95% to 5%.

Â So there'll be a big majority of, of one

Â caste group and a small minority of another caste group.

Â And so what we can do is look at how balanced the groups are.

Â So let's look at the relative size

Â of the, of how big the minority is compared to the majority.

If the minority is above median size, then that village gets a light blue. So these are the ones down here. We can see that most of them end up pretty far below; there are only a couple that end up anywhere above the 45-degree line. The reds, on the other hand, are the villages where there's more imbalance, so the smaller caste group is more of a minority. And now you find that the reds are a bit closer to the 45-degree line. So the more skewed the village is, the less of this pressure you find, according to the statistics, whereas if you've got a very well-balanced village, then the castes seem to separate more, and in particular with the triangles you see even more pressure to separate.

This is actually something that you see in different data sets: the more balanced things are, the more tension there can be in forming cross-group ties. In particular, here we see that, across castes, you more often get links relative to triangles in this kind of setting.
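
The median split by group balance could be sketched as below. Every number in the `villages` list is invented for illustration, and the tuple layout and function names are assumptions; the point is only to show the mechanics of splitting at the median minority share and comparing distances from the 45-degree line.

```python
from statistics import median

# Hypothetical villages: (minority share of households, triangle ratio,
# link ratio raised to the 3/2 power). All values are made up.
villages = [
    (0.45, 0.20, 0.35),
    (0.40, 0.25, 0.30),
    (0.10, 0.33, 0.34),
    (0.05, 0.41, 0.40),
]


def split_by_balance(rows):
    """Split villages at the median minority share into balanced and skewed."""
    cutoff = median(share for share, _, _ in rows)
    balanced = [r for r in rows if r[0] > cutoff]
    skewed = [r for r in rows if r[0] <= cutoff]
    return balanced, skewed


def mean_gap(rows):
    """Average shortfall of the triangle ratio below the 45-degree line."""
    return sum(link_32 - tri for _, tri, link_32 in rows) / len(rows)


balanced, skewed = split_by_balance(villages)
print(f"balanced gap: {mean_gap(balanced):.3f}, skewed gap: {mean_gap(skewed):.3f}")
```

A larger gap for the balanced group would correspond to the pattern described in the lecture, where the light blue villages sit further below the line than the red ones.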

So this is just one illustration of how we might begin to marry these kinds of models. What it does show us is that we can use preferences together with other kinds of statistical models to begin to estimate some of these models and see what's going on in the data. It gives us a little bit of a lens, though it's hard to interpret this closely. But at least we can figure out whether there are certain patterns in the data, and here there are patterns among the triangles and the links. So we reject the null hypothesis: based on the model, people show a significantly stronger preference against cross-caste triangles than against cross-caste links, in terms of what we estimate. Now, whether or not they truly have those preferences depends on whether the model is correct, and the model is a little bit simple here, more for purposes of illustration. But we can begin to build richer models that take more into account and see whether this finding holds up.
