So now we're going to talk about another class of models which are easy to estimate and related to these exponential random graph models and SERGMs, and these are called SUGMs. So we're very inventive in our names. So what, what the idea here is again we're going to be looking at, at ways to easily incorporate dependencies and networks and think about different kinds of sub graphs that might be forming and allow those to be generated and try and estimate how they're being generated. And the idea of SUGMs the, the, the terminology is quite simple. It's from subgraph generation model. So we think of subgraphs as being generated. Links, triangles small stars etcetera. Those are the things that are going to be generated, and then the network is a byproduct, right? So what's happening is people are forming different types of relationships, different kinds of cliques, and the network bubbles up from that. Okay? So in particular, what we're going to think of is think of whatever type of, of subnetwork you like. So, maybe it's a link. Maybe it's a triangle and what happens is these things will be independently with some probability p, p sub j and then we end up with some number of those. So, say s sub j, sub-networks. So our 45 links or ten triangles, these things are going to be generated and, the difficulty is that they may overlap. Okay? So, sometimes people form links. They also form a clique. So, I end up partnering with somebody but I also form groups. So if you wanted to look at say a co-authorship network, I might have written with one person. And just one paper but then we also collaborated with somebody else on a different paper, so we end up forming both the link and the triangle and if we just look at the co-authorship network then it's hard to see that directly in the data. So we observed the resulting network. And then what we're trying to do is infer these different probabilities of different types of relationships they came out, okay? So let's just go through a simple example. We start with a bunch of nodes. Let's think of just generating links and triangles. So there's some probability that different links form. So in this case, a bunch of links form here, nine different sorry, triangles form. So nine different triangles are formed at random. So out of all the possible triangles, we form nine of them. Then a bunch of links formed. So the links were dropped down afterwards. And, what do we end up with? We end up incidentally generating some triangles, right? So when we look at this, there was these two edges had already been there because of a triang, of triangles that were generated. This one was generated as a link and it ends up generating a triangle and when we actually see it in the end we can't tell which ones were generated in which way. So what we're left with is some links, some triangles and we're trying to estimate then what is the probability of these different types of sub-graphs being formed. Okay. So, that's the idea So these are sub-graph generation models, so first of all let's try and see is there a way to view these sub-graph generation models as statistical exponential random graph models? Now you remember when we went through exponential random graph models to begin with, I showed you how you could take a a model of, of just an [INAUDIBLE] type where you're generating links at random, and you can represent that as an exponential function. So you can do the same with just triangles and links. And it's just going to sum up. So you can go through that same kind of exercise. And it, it shows you then there's actually theorem this is from the paper with Arun. So if you have a sub graph generation model with parameters p sub j. Let s be the true count of how many links were formed and how many different triangles are formed and so forth. So let that be the vector of true counts of subgraphs true is in italics here because we're not going to actually observe the true number, we're going to have to infer that. But then the probability that you would see that would be proportional to An exponential family. So, it would fall under the SERGM family. And, in particular, the betas would actually be the log odds ratios. And, the K's would be how many different configurations you could form of having exactly a particular count s sub j. So, the fact that, you know, how many ways could you get a certain number of of triangles out of how many different triangles there were. Okay? So and then you take that product and that's what the actual K is. So here, the K's look different than the n's we had in the exponential random graph model. But if we just formed these networks, sub graphs at random and then percolate up a network. Then that gives rise to a specific form in the expedential family and it has very particular parameters and very particular weights. So, what's the difficulty here? Well, the difficulty is that the true counts of exactly how many triangles are formed and how many links are formed aren't exactly observable. But we can get close to estimating them and observing them when things are sparse. So, if you noticed in our previous example, we had only one. So given that there weren't too many links and too many triangles there, we only ended up with one incidental. So the incidental was, were pretty rare. It's not that often that you actually get that the, the links fall down in a way that accidentally generate triangles and, and so forth. And so if the things are sparse than we can actually use the particulars to, to estimate these things. So what we'll do next in the next video is talk about try to estimate these kinds of, of models.