Now, we're going to take up Estimation of Mediated Effects.

So if you haven't reviewed the previous few lessons,

go ahead and do so now, to make sure we're all on the same page.

Now, the estimation is straightforward when you can use

off-the-shelf software in conjunction with models.

And so the most common case, which we have already looked at a bit,

is the linear structural equation model with continuous outcomes M and Y,

extended here to allow for the inclusion of covariates X

and also, because we can do it pretty simply,

an interaction between the mediator and the treatment.

I'm going to begin with the model for the potential outcomes

and compare it to the model for the observed outcomes.

You should be familiar with this strategy by now.

So for z = 0 (no treatment) and z = 1 (treatment),

and for all m in the range of M,

here's the model in terms of potential outcomes

below: one equation for M and the next equation for Y.

And we'll make the usual kinds of assumptions on the errors

in the potential outcomes model,

which allow us to identify the model parameters and

equate them with the average

treatment effects and

those kinds of things that we're interested in estimating, okay?

So as it says, the model parameters are denoted with the superscript C

to indicate these are the parameters of the model for the potential outcomes.

And when we get to the model for the observed data, the Cs will come off.
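
For reference, a model of this kind can be written out as follows. This is a sketch, not a copy of the slide: the intercepts \kappa, the covariate coefficients \lambda, the mediator coefficient \alpha_M, and the error symbols \varepsilon are assumed notation. Only \alpha_Y, \beta_Y, \delta_Y, and the superscript c are named in the lecture.

```latex
\begin{align*}
M_i(z)    &= \kappa_M^c + \alpha_M^c z + \lambda_M^{c\prime} X_i + \varepsilon_{Mi}^c(z), \\
Y_i(z, m) &= \kappa_Y^c + \alpha_Y^c z + \beta_Y^c m + \delta_Y^c z m
             + \lambda_Y^{c\prime} X_i + \varepsilon_{Yi}^c(z, m), \\
E[\varepsilon_{Mi}^c(z) \mid X_i] &= 0, \qquad
E[\varepsilon_{Yi}^c(z, m) \mid X_i] = 0 .
\end{align*}
```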

Now, I'm going to use the second equation, the model for the outcome,

to define the controlled direct effects, okay?

So you can see, this is just very straightforward algebra.

And, of course, the errors have zero mean conditional on X;

we've assumed that, right?

All right, so you get this term, this alpha Y,

which has to do with the difference between treatments.

And you get this beta Y, which has to do with the difference between the mediators.

And then, you get this interaction term, the delta Y for the interaction.
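
The algebra can be sketched like this, assuming an outcome equation of the form Y_i(z, m) = \kappa_Y^c + \alpha_Y^c z + \beta_Y^c m + \delta_Y^c z m + \lambda_Y^{c\prime} X_i + \varepsilon_{Yi}^c(z, m), with errors that have zero mean given X. The symbols \kappa, \lambda, and \varepsilon are assumed notation, not taken from the slide.

```latex
\begin{align*}
E[\,Y_i(z, m) - Y_i(z^*, m^*) \mid X_i\,]
  &= \alpha_Y^c (z - z^*) + \beta_Y^c (m - m^*) + \delta_Y^c (z m - z^* m^*), \\
\text{and with } m = m^*,\ z = 1,\ z^* = 0: \quad
E[\,Y_i(1, m) - Y_i(0, m) \mid X_i\,]
  &= \alpha_Y^c + \delta_Y^c m .
\end{align*}
```

So with an interaction, the controlled direct effect of the treatment depends on the level m at which the mediator is held.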

Alright, let's move forward.

With Z and M interacting, the direct and indirect effects depend on

which of the two previously discussed decompositions of

the total effect we use.

Remember, when they don't interact, when we have additivity,

the two decompositions give the same result.

We saw that earlier.

In any case, the effects may be derived using this equation for

the potential outcome Y_i(z, M_i(z*)).

So that's just when you set the treatment to z,

and then you set z star to be either z or not z,

and you take the value the mediator would have at z star.

So now, we get this equation.

You can just do the algebra.

And under the conditions for the identification of average direct

and indirect effects that we gave previously,

we get the following two results below.

Okay, and of course, we need this expectation of the errors to be 0.

We've seen that kind of thing before.
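
Here is a sketch of that algebra, assuming the linear model M_i(z) = \kappa_M^c + \alpha_M^c z + \lambda_M^{c\prime} X_i + \varepsilon_{Mi}^c(z) for the mediator and Y_i(z, m) = \kappa_Y^c + \alpha_Y^c z + \beta_Y^c m + \delta_Y^c z m + \lambda_Y^{c\prime} X_i + \varepsilon_{Yi}^c(z, m) for the outcome, with all error terms having zero mean given X. The symbols \kappa, \lambda, \alpha_M, and \varepsilon are assumed notation and may differ from the slide.

```latex
\begin{align*}
E[\,Y_i(z, M_i(z^*)) \mid X_i\,]
  &= \kappa_Y^c + \alpha_Y^c z
     + (\beta_Y^c + \delta_Y^c z)\,(\kappa_M^c + \alpha_M^c z^* + \lambda_M^{c\prime} X_i)
     + \lambda_Y^{c\prime} X_i, \\[4pt]
E[\,Y_i(1, M_i(0)) - Y_i(0, M_i(0)) \mid X_i\,]
  &= \alpha_Y^c + \delta_Y^c\,(\kappa_M^c + \lambda_M^{c\prime} X_i)
  && \text{(direct)}, \\
E[\,Y_i(1, M_i(1)) - Y_i(1, M_i(0)) \mid X_i\,]
  &= (\beta_Y^c + \delta_Y^c)\,\alpha_M^c
  && \text{(indirect)}, \\[4pt]
E[\,Y_i(1, M_i(1)) - Y_i(0, M_i(1)) \mid X_i\,]
  &= \alpha_Y^c + \delta_Y^c\,(\kappa_M^c + \alpha_M^c + \lambda_M^{c\prime} X_i)
  && \text{(direct, other decomposition)}, \\
E[\,Y_i(0, M_i(1)) - Y_i(0, M_i(0)) \mid X_i\,]
  &= \beta_Y^c\,\alpha_M^c
  && \text{(indirect, other decomposition)}.
\end{align*}
```

Each pair sums to the same total effect, and when \delta_Y^c = 0 the two decompositions coincide.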

Using these identification conditions, we can relate these to the

analogous parameters in the observed data model.

Okay, you see that's the same sort of model, but with the Cs taken off,

and a different assumption about the errors, because we're conditioning on Z

as well.

These are, sort of, the regression errors.
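
Written out, an observed data model of this kind might look as follows. The symbols \kappa, \lambda, \alpha_M, and \varepsilon are assumed notation, and the exact conditioning sets for the errors are an assumption in keeping with ordinary regression.

```latex
\begin{align*}
M_i &= \kappa_M + \alpha_M Z_i + \lambda_M^{\prime} X_i + \varepsilon_{Mi},
  & E[\varepsilon_{Mi} \mid Z_i, X_i] &= 0, \\
Y_i &= \kappa_Y + \alpha_Y Z_i + \beta_Y M_i + \delta_Y Z_i M_i
       + \lambda_Y^{\prime} X_i + \varepsilon_{Yi},
  & E[\varepsilon_{Yi} \mid Z_i, M_i, X_i] &= 0 .
\end{align*}
```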

And anyway, when the identification conditions hold,

the model parameters in the observed data model

and the potential outcomes model are the same,

so that we can use the observed data model to estimate the direct and

indirect effects,

and the controlled direct effects, of course, as well,

which actually require somewhat weaker conditions.

We can estimate all these guys using the coefficients in

the estimated observed data model.

Okay, for the remainder of the lesson,

I'm going to assume that the observations are independent and

identically distributed.

So then the observed data model

I'm going to estimate by applying ordinary least squares regression to each equation.

And then, the delta method can be used to obtain standard errors for

the estimated effects.

If you're not familiar with the delta method, look it up in an elementary

probability or statistical inference text,

one that, of course, assumes you've had calculus, at least.
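
To make the two steps concrete, here is a minimal sketch in Python with NumPy. It fits the two regressions by OLS on simulated data and applies the delta method to one indirect effect, (beta_Y + delta_Y) * alpha_M. The simulated data, the variable names, the helper functions `ols` and `indirect_effect`, and the choice to treat the two equations' estimates as uncorrelated are all assumptions made for illustration.

```python
import numpy as np

def ols(design, y):
    """Return OLS coefficients and their estimated covariance matrix."""
    n, p = design.shape
    xtx_inv = np.linalg.inv(design.T @ design)
    coef = xtx_inv @ design.T @ y
    resid = y - design @ coef
    sigma2 = resid @ resid / (n - p)
    return coef, sigma2 * xtx_inv

def indirect_effect(z, m, y, x):
    """Estimate (beta_Y + delta_Y) * alpha_M and a delta-method standard error."""
    ones = np.ones_like(y)
    # Mediator equation: M ~ 1 + Z + X
    cm, vm = ols(np.column_stack([ones, z, x]), m)
    # Outcome equation: Y ~ 1 + Z + M + Z*M + X
    cy, vy = ols(np.column_stack([ones, z, m, z * m, x]), y)
    a_m, b_y, d_y = cm[1], cy[2], cy[3]
    est = (b_y + d_y) * a_m
    # Gradient of the estimate with respect to (a_m, b_y, d_y)
    grad = np.array([b_y + d_y, a_m, a_m])
    # Assumption: estimates from the two equations are uncorrelated,
    # so the joint covariance matrix is block diagonal.
    cov = np.zeros((3, 3))
    cov[0, 0] = vm[1, 1]
    cov[1:, 1:] = vy[2:4, 2:4]
    se = np.sqrt(grad @ cov @ grad)
    return est, se

# Hypothetical simulated data; the true indirect effect is (0.8 + 0.4) * 1.0 = 1.2
rng = np.random.default_rng(0)
n = 5000
x = rng.standard_normal(n)
z = rng.integers(0, 2, size=n).astype(float)
m = 0.5 + 1.0 * z + 0.3 * x + rng.standard_normal(n)
y = 1.0 + 0.5 * z + 0.8 * m + 0.4 * z * m + 0.2 * x + rng.standard_normal(n)

est, se = indirect_effect(z, m, y, x)
print(f"indirect effect: {est:.3f} (SE {se:.3f})")
```

The direct effects can be handled the same way, with the gradient taken with respect to whichever coefficients appear in the formula.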

Now, if the mediator is continuous but the outcome Y is binary,

you probably wouldn't want to use a structural equation model for

a continuous outcome Y.

So it seems natural to use a logistic regression model

or a probit model to estimate the outcome probabilities.

Okay, so remember we're focusing on the use of off-the-shelf software

to estimate direct and indirect effects.

So if you use the logistic model, with the mediator

following the normal distribution, at least conditionally,

then this paper by VanderWeele and Vansteelandt gives formulae

in terms of model parameters for average direct and indirect effects.

But the effects are now defined in terms of odds ratios, which is, kind of,

a common thing.

That's often done.

Some people don't like it, and some people think it's fine.

Similarly, when the mediator is conditionally normally distributed with

constant variance

and the outcome is a survival time modeled using the accelerated failure time model,

which is a special model, then VanderWeele gives formulae

in terms of model parameters for differences in logged outcome means.

Okay, that said,

the differences cannot be expressed as a difference of unit effects

for the outcome Y or some transformation of Y.

Now, so much for the use of off-the-shelf software

for estimating these average direct and indirect effects.

More generally, Hong has proposed a general method for estimating

average direct and indirect effects, based on an extension of the weighting approach

we saw in part one.

Remember, inverse probability of treatment weighting.

Okay, now, the method does not require modeling the outcome Y,

which is a point Hong emphasizes.

But it does require modeling the mediator M and the treatment assignment process.

So Hong shows that under the identification assumptions given in

the previous lesson,

you get this for the expected value of Y(z, M(z*)), the potential outcome when the treatment is set to little z

and M takes on the value that it would take on

if the treatment assignment variable were set to z star.
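
As a rough illustration of a weighting estimator of this kind (not necessarily Hong's exact estimator), here is a sketch in Python with NumPy. It weights each unit with Z = z by the inverse probability of its treatment times the ratio of mediator probabilities under z star and under z. To stay self-contained it assumes binary Z, M, and X and estimates the probabilities by cell frequencies instead of fitted models. The function name `weighted_potential_mean` and the simulated data are hypothetical.

```python
import numpy as np

def weighted_potential_mean(y, z, m, x, z_set, z_star):
    """Weighted estimate of E[Y(z_set, M(z_star))] with binary Z, M, X."""
    weights = np.zeros(len(y))
    for xv in (0, 1):
        in_x = x == xv
        # Treatment assignment model: P(Z = z_set | X = xv)
        p_z = np.mean(z[in_x] == z_set)
        for mv in (0, 1):
            # Mediator model: P(M = mv | Z, X = xv) under z_star and under z_set
            p_m_star = np.mean(m[in_x & (z == z_star)] == mv)
            p_m_set = np.mean(m[in_x & (z == z_set)] == mv)
            cell = in_x & (z == z_set) & (m == mv)
            weights[cell] = (p_m_star / p_m_set) / p_z
    # Units with Z != z_set keep weight zero
    return np.sum(weights * y) / len(y)

# Hypothetical simulated data in which E[Y(1, M(0))] = 3.4
rng = np.random.default_rng(1)
n = 20000
x = rng.integers(0, 2, size=n)
z = rng.binomial(1, 0.3 + 0.4 * x)
m = rng.binomial(1, 0.2 + 0.5 * z + 0.2 * x)
y = 1.0 + z + 2.0 * m + z * m + x + rng.standard_normal(n)

est = weighted_potential_mean(y, z, m, x, z_set=1, z_star=0)
print(f"estimated E[Y(1, M(0))]: {est:.3f}")
```

Note that the outcome is never modeled here; only the mediator and treatment probabilities are estimated.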

Okay, so now, another approach that doesn't use off-the-shelf software.

Imai and colleagues have proposed approaches using simulation.

Both of their approaches require modeling both the outcome and the mediator.

One of these approaches uses the sampling distribution

of the model coefficients.

The other approach, which can be used more generally,

begins by drawing bootstrap samples, okay,

of the original data.

Now, for each bootstrap sample, the models for the outcome and

the mediator are then fit.

Then, for each unit i

and for each possible value z of the treatment,

K values of the mediator are simulated under the model.

And these simulated values (the s up there is for simulated),

the simulated values of the intermediate outcome, the mediator,

are used in turn to simulate corresponding outcome values.

Okay, as you can see there.

Then, the outcome values for each unit are averaged,

and then they are averaged over the units, and that gives the estimated

average direct and indirect effects for bootstrap sample little l.

The estimates from the L bootstrap samples may then be used

to compute means, confidence intervals, etc.
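
The steps just described can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions, not the authors' implementation: both models are linear with normal errors, the treatment is binary, and the names `fit`, `simulate_effects`, `n_boot` (the number L of bootstrap samples), and `k` (the number K of simulated mediator values) are hypothetical.

```python
import numpy as np

def fit(design, y):
    """OLS fit returning coefficients and the residual standard deviation."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sd = np.sqrt(resid @ resid / (len(y) - design.shape[1]))
    return coef, sd

def simulate_effects(z, m, y, x, n_boot=100, k=20, seed=0):
    """Return an (n_boot, 2) array of [direct, indirect] effect estimates."""
    rng = np.random.default_rng(seed)
    n = len(y)
    ones = np.ones(n)
    effects = np.empty((n_boot, 2))
    for b in range(n_boot):
        # Draw a bootstrap sample of the original data
        idx = rng.integers(0, n, size=n)
        zb, mb, yb, xb = z[idx], m[idx], y[idx], x[idx]
        # Fit the mediator and outcome models on the bootstrap sample
        cm, sm = fit(np.column_stack([ones, zb, xb]), mb)
        cy, sy = fit(np.column_stack([ones, zb, mb, zb * mb, xb]), yb)

        def draw_m(z_star):
            # k simulated mediator values per unit with treatment set to z_star
            mean = cm[0] + cm[1] * z_star + cm[2] * xb
            return mean[:, None] + sm * rng.standard_normal((n, k))

        def mean_y(z_set, m_sim):
            # Simulate outcomes, average within each unit, then over units
            mean = (cy[0] + cy[1] * z_set + (cy[2] + cy[3] * z_set) * m_sim
                    + cy[4] * xb[:, None])
            y_sim = mean + sy * rng.standard_normal((n, k))
            return y_sim.mean(axis=1).mean()

        m0, m1 = draw_m(0), draw_m(1)
        y10 = mean_y(1, m0)
        effects[b, 0] = y10 - mean_y(0, m0)   # direct: Y(1, M(0)) - Y(0, M(0))
        effects[b, 1] = mean_y(1, m1) - y10   # indirect: Y(1, M(1)) - Y(1, M(0))
    return effects

# Hypothetical simulated data; true direct effect 0.7, true indirect effect 1.2
rng = np.random.default_rng(2)
n = 1000
x = rng.standard_normal(n)
z = rng.integers(0, 2, size=n).astype(float)
m = 0.5 + 1.0 * z + 0.3 * x + rng.standard_normal(n)
y = 1.0 + 0.5 * z + 0.8 * m + 0.4 * z * m + 0.2 * x + rng.standard_normal(n)

effects = simulate_effects(z, m, y, x)
direct, indirect = effects.mean(axis=0)
ci = np.percentile(effects, [2.5, 97.5], axis=0)  # rows: lower, upper
print(f"direct {direct:.3f}, indirect {indirect:.3f}")
```

With other outcome or mediator models, only `fit`, `draw_m`, and `mean_y` would need to change; the bootstrap and averaging steps stay the same.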