In this video, we're going to focus on instrumental variables in observational studies. In other videos, we talked about instrumental variables in randomized trials with noncompliance. Here we're going to think about their use in observational studies, think critically about the validity of the IV assumptions in those settings, and go through a number of examples of IVs that have been used in practice.
As an overview, the observational setting will look a lot like the IV situation in randomized studies, where we randomized treatment and had some noncompliance. We'll use the same notation: Z is the instrument, A is the treatment, and Y is the outcome. Potentially, you've also collected some covariates X. We'll generally think of Z as encouragement to receive treatment. We're not actually randomizing Z, but we're imagining Z as a variable that ends up encouraging people to either receive treatment or not. If Z is binary, it's just encouragement, yes or no: people are either encouraged or not. If Z is continuous, we can think of it as a dose of encouragement: the larger Z is, the more encouragement somebody gets to take treatment. Even though we, as investigators, are not physically doing the randomization ourselves, we are going to think of Z as randomized by nature, in some sense, as a sort of natural experiment.
So the question is, how do you find a variable like that? The motivation might be that you have an observational study with a research question and data, but you haven't measured very many confounding variables. Maybe you have a small list of variables, and you're worried about unmeasured confounding.
In that case, you really might want to try an instrumental variable analysis, but you'll need to identify an instrument. As a reminder, we have to make a couple of assumptions. One is that the variable you're proposing as an instrument has to affect treatment, and thankfully, we can check that with data. We'll also have to make the exclusion restriction assumption, and that's going to rely largely on subject matter knowledge. That assumption is where there's usually the most debate about whether something really is a valid instrument or not. So in this video, we'll go through several examples that have been proposed in the literature, and we can think about the reasoning behind each one and why people thought it was a valid instrument.
One instrument that's been used or proposed in the literature is calendar time. This comes up with medications, for example, but really with anything in medical practice where treatment preferences change over some short period of time. I have a hypothetical, generic example here in a picture. Imagine there are two drugs used to treat the same condition. I drew the curve for Drug A in red and the curve for Drug B in blue. On the horizontal axis we have time, and on the vertical axis we have the probability of treatment. What you'll see is that in the early years, Drug A is not very commonly used and Drug B is preferred. So the probability of treatment is high for Drug B initially and low for Drug A. But over some period of time that changes, so that now Drug A is very popular and Drug B is not.
So this might happen if, for example, a new drug is introduced to the market and, over some maybe relatively short period of time, Drug A goes from not being very commonly used to being commonly used. There's a specific example that's been proposed in the literature that looked at sulfonylureas versus metformin for treatment of diabetes. You could think of metformin as Drug A: when it was first introduced, it wasn't used very often, but over time it became more popular. Of course, at any given time, who gets Drug A versus Drug B is probably not random. It probably depends on patient characteristics. But calendar time could potentially be something that you think of as essentially randomizing. I'm going to draw two vertical lines on the graph, one cross-section of time versus another, and I'll call them time 1 and time 2. If you happen to be diagnosed with diabetes at time 1, you're more likely to get Drug B, whereas if you were diagnosed at time 2, you're more likely to get Drug A. And if time 1 and time 2 aren't very far apart, most other things in the world might be very similar, so time itself might function very much as if the drug had been randomized, to some degree. So early time period versus late time period has been proposed as the kind of thing that could function as an instrument. In this case, body mass index was the outcome of interest because there was concern about weight gain, I think with sulfonylureas in particular. They wanted to know the causal effect of these different drugs on weight or BMI. But these treatments weren't actually randomized; they were just given as they would be in actual practice. So could we use these time periods as an instrument?
And so, is Z in this case a valid instrumental variable? That's the question. Well, it's certainly associated with treatment received, and that's very easy to show. I hand-drew a graph on the previous slide, but the real-life graph was not that different: there was a big change over time in the popularity of these two medications. So it's very easy to show that if Z is early versus late time period, Z definitely affects treatment. But what about the exclusion restriction? The exclusion restriction says that this calendar time variable should not affect the outcome, here BMI, directly. It should only affect it through its impact on treatment. And you could imagine that this is the assumption there would be a lot of debate about: is calendar time valid? If the early and late periods are relatively close in time, I think you have a stronger argument. But an argument against it would be that other things may have changed between the early period and the late period. Maybe how diabetes is treated in general has changed. Maybe clinicians more actively encourage certain health behaviors than they did in the past. Or maybe more information about how to self-manage the disease has come out between those two time periods, so people with diabetes behave differently in the late period than they did in the early period. So this is one of those cases where there would have to be a lot of thought and discussion about whether it's a valid instrument. In general, whenever we identify a candidate IV, a lot of thought goes into the exclusion restriction: is it a valid assumption? And if you write a paper using an instrumental variable analysis, there will certainly be reviewers who are critical of the assumption.
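The first assumption, that the instrument affects treatment, is the one that can be checked with data. As a minimal sketch, assuming a binary instrument Z (0 for the early period, 1 for the late period) and a binary treatment A (1 if the patient received Drug A), the check is just a comparison of treatment proportions across the two periods. The counts below are made up for illustration and do not come from the diabetes study, and the function name `first_stage` is hypothetical.

```python
from math import sqrt

def first_stage(z, a):
    """Compare the proportion treated across instrument levels.

    z: list of 0/1 instrument values (0 = early period, 1 = late period)
    a: list of 0/1 treatment values (1 = received Drug A)
    Returns (p0, p1, difference, standard error of the difference).
    """
    n1 = sum(z)
    n0 = len(z) - n1
    p1 = sum(ai for zi, ai in zip(z, a) if zi == 1) / n1
    p0 = sum(ai for zi, ai in zip(z, a) if zi == 0) / n0
    diff = p1 - p0
    # Standard error for a difference of two independent proportions
    se = sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    return p0, p1, diff, se

# Hypothetical counts: 1000 patients per period,
# 200 treated with Drug A early and 700 treated with Drug A late.
z = [0] * 1000 + [1] * 1000
a = [1] * 200 + [0] * 800 + [1] * 700 + [0] * 300

p0, p1, diff, se = first_stage(z, a)
print(f"P(A=1 | early) = {p0:.2f}, P(A=1 | late) = {p1:.2f}")
print(f"difference = {diff:.2f}, standard error = {se:.3f}")
```

A difference that is large relative to its standard error supports the claim that Z affects treatment. It says nothing about the exclusion restriction, which still rests on subject matter knowledge.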
And it's important to look critically at these assumptions. Nevertheless, calendar time seems like a reasonable idea: if calendar time has a big impact on treatment received and the time period is relatively short, then the main thing that has changed over that time is probably just the prescribing preferences. So it's probably at least a pretty good approximation to an instrument. But these things have to be thought about and debated.
Another popular type of instrument has to do with distance. Imagine you're interested in whether specialty care centers are better than general hospitals, or some other kind of non-specialty care center. Typically, you're not going to randomize people to specialty care centers. One idea that's been proposed is that where these specialty care centers are located might matter. For example, if you have a specialty care center right in your neighborhood, a very short drive or a short walk away, you might be more likely to go there than if it was a long distance away. So having a specialty care center near you could be thought of as encouragement to go there. Then there's the question of whether distance itself is likely to be associated with the outcome in other ways. Maybe people who live near specialty care centers differ from people who don't, and maybe that's related to the outcome. That is something that would have to be thought about. As a specific example, one published paper focused on differential travel time. It looked at high-level NICUs, neonatal intensive care units, versus regular hospitals. So let's say, ideally, you would go to a NICU, but you might instead just go to a regular hospital.
And which one you go to might have a lot to do with travel time, because if you need to get to a NICU, you probably need to do so in a hurry. Why differential travel time? Well, you might be far away from both a regular hospital and a high-level NICU, or close to both, so it's really the differential travel time that arguably matters. If you're very close to a hospital and very far from a NICU, you might go to the hospital. So we wouldn't want to use just distance to a NICU as the instrument. If you had somebody's zip code or address, you could use geocoding to find out how close they are to the nearest high-level NICU and how close they are to the nearest regular hospital, take the difference of those, and use that as the instrument. If the differential is big in the direction where you're much closer to a NICU, that's high encouragement. If you're much closer to a regular hospital, that's low encouragement. That seems like a reasonable thing to consider as an instrument, because you would certainly expect it to be related to where you actually end up going. The treatment A is delivery at a high-level NICU versus a regular hospital, and the outcome might be mortality. The debate would then be about whether this differential travel time might be associated with the outcome in ways that do not go through treatment.
There are many other examples that have been proposed in the literature. One that's become increasingly common has to do with Mendelian randomization. The idea is that some genetic variant might be associated with a behavior like alcohol use but might not affect some outcome of interest except through that behavior. So there might be some genetic variant that's been shown to be associated with alcohol use, for example.
Maybe you're interested in the causal effect of alcohol use on some outcome. Alcohol use isn't randomized, but under the Mendelian randomization assumptions the genetic variant is, so sometimes it's used as an instrument.
Another one that's been used is provider preference. This comes up in medication studies where you want to compare, say, two drugs. Your provider, your clinician, might have a preference for one drug over another. They might not always prescribe the same drug, but they might tend to prefer one over the other. So the idea would be, for example, to look at what they prescribed the previous patient, possibly the previous patient who was like you, maybe the previous patient who was newly diagnosed with diabetes. What they prescribed the previous person is probably correlated with what they'll prescribe you, because it indicates something about their preference. If they previously prescribed someone metformin, maybe they like metformin, so they might be more likely to prescribe you metformin. So the previous prescription is probably related to what you'll receive, but you could argue that what they prescribed someone else shouldn't directly affect any of your outcomes. It should only affect them through the treatment decision. That's the argument that's made for provider preference as an IV, and it's been used quite a few times. Of course, there's a lot of debate about whether it's a valid instrument. Is the exclusion restriction violated? Maybe prescribers who prefer Drug A over Drug B tend to be better in other ways than providers who prefer Drug B over Drug A. So that has to be thought about and discussed.
Another one that's been looked at is quarter of birth.
So quarter of birth has been shown to be associated with how long you stay in school, how many years of schooling you have. And you can argue that quarter of birth is essentially randomized, and yet it's predictive of years in school. So if you're interested in the impact of years in school on income, you might want to use quarter of birth as an instrument, because we don't actually randomize years in school directly. In fact, how long you stay in school is probably very strongly associated with all kinds of things: socioeconomic status, how much education your parents had, and so on. So years in school is far from randomized, but quarter of birth, you could argue, is essentially randomized. And so that's been proposed as an instrument for that kind of study.
So these are just a few examples of instruments that have been used in practice. You can imagine just from these examples that they usually involve some kind of clever idea about what might be an instrument, but then there's also a lot of debate about whether it's valid.
How does this relate to compliance? In other videos, when we talked about randomized trials with noncompliance, we focused on trying to estimate a causal effect among compliers. Well, in an observational study, what does it mean to be a complier? Just as a reminder, compliers are people who take treatment only when encouraged, defiers are people who do the opposite of what they're encouraged to do, and so on. Here, what we mean by encouraged really has to do with the instrument that's used in practice. So when we talk about compliance here, it doesn't mean actual compliance with what an investigator told people to do, but it does mean compliance with encouragement. So go back to provider preference from the previous slide, where we looked at what the prescriber prescribed to the previous person.
Let's say that your prescriber prescribed the previous person metformin. Well, you would be considered a complier if you also received metformin. So we're keeping the same kind of language, even though it doesn't apply quite as readily. The main idea is that if the instrument is such that you were encouraged, and you did get the treatment, we'll call that compliance. So it's the same kind of idea, but applied in the observational setting.
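To make the idea of a causal effect among compliers concrete, here is a minimal simulation sketch. It uses the standard ratio (Wald) estimator for a binary instrument: the difference in mean outcome between encouraged and non-encouraged people, divided by the difference in the proportion treated. Everything in it is an assumption made for illustration: the proportions of compliers, always-takers, and never-takers, the absence of defiers, the baseline outcome levels, and the treatment effect of 2.0 are all made up, and the exclusion restriction holds only because the simulation is built that way.

```python
import random

def simulate(n=200_000, seed=0, effect=2.0):
    """Simulate (z, a, y) rows under made-up IV assumptions with no defiers."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0   # encouragement, as if randomized by nature
        u = rng.random()                     # draws the unobserved compliance type
        if u < 0.5:                          # compliers: treated only if encouraged
            a, baseline = z, 0.0
        elif u < 0.7:                        # always-takers: treated regardless of z
            a, baseline = 1, 3.0
        else:                                # never-takers: untreated regardless of z
            a, baseline = 0, -1.0
        # z influences y only through a, so the exclusion restriction holds by construction
        y = baseline + effect * a + rng.gauss(0.0, 1.0)
        rows.append((z, a, y))
    return rows

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def wald_estimate(rows):
    """Ratio of the instrument's effect on the outcome to its effect on treatment."""
    y_diff = mean(y for z, a, y in rows if z == 1) - mean(y for z, a, y in rows if z == 0)
    a_diff = mean(a for z, a, y in rows if z == 1) - mean(a for z, a, y in rows if z == 0)
    return y_diff / a_diff

def naive_difference(rows):
    """Treated-versus-untreated comparison that ignores unmeasured confounding."""
    return mean(y for z, a, y in rows if a == 1) - mean(y for z, a, y in rows if a == 0)

if __name__ == "__main__":
    rows = simulate()
    print("naive treated vs untreated:", round(naive_difference(rows), 2))
    print("Wald (IV) estimate        :", round(wald_estimate(rows), 2))
```

In this setup the compliance type acts as an unmeasured confounder, because it drives both treatment and the baseline outcome. The naive comparison is therefore pulled away from the simulated effect, while the ratio estimator should land close to it. With real data there is no such guarantee, since the assumptions cannot be built in.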