So, again, not a scientifically applicable or useful

quantity because we can't have persons with HDL zero.

But, this is a necessary piece of the resulting equation

either in the slope form or the exponentiated form and will be

used when estimating the odds and probabilities of obesity

for any single group of adults given their sex and HDL levels.

We'll look at doing that in a subsequent section here as well.

So, if we wanted to actually present the results that we had looked at from

unadjusted and compare it to adjusted, we would do something like this.

So, I'm actually going to have age quartiles in

this table too and I'll speak to that in a bit.

I'll be bringing the age quartiles in above and beyond sex and HDL in the next table.

But, let's focus here on the unadjusted associations between obesity and sex,

and obesity and HDL, and

then the model we just looked at where they were adjusted for each other.

So, I have a column here that says unadjusted and I present

unadjusted associations between obesity,

sex, HDL, and age.

Then, I have a column here that says adjusted,

but since the only entries in the column are for sex and HDL,

and there's no piece for age,

the implication if you saw this in

a journal article would be that the final adjusted model they presented

only included sex and HDL.

So, let's look at what happened to the relationship between obesity and sex.

It was positive and statistically significant in the sense that

females had higher odds before adjusting for HDL.

We also saw that females have higher HDL than males and

a higher HDL is associated with lower odds of obesity.

So, when we adjust for the differential levels of HDL between females and males,

the association actually gets larger for females because we're no longer

mixing in a disproportionate number of

persons in the comparison who have higher HDL among the females,

which would bring down the odds of obesity.

So, when we take that out the odds ratio of obesity for females to males gets larger.

The estimate goes from 1.44 to 2.15 and the confidence interval shifts up.

You can see that there is no overlap between these two confidence intervals,

so it appears that there was real confounding here going on between sex and HDL.

Let me just diagram what I mean by the unadjusted comparison between females and males.

Just for the moment, to make it easier, I'm going to represent

low HDL and high HDL and I'm going to indicate

that with the letters L and H. We saw that females had higher HDL,

so there'll be a higher proportion of females with high HDL than low.

I'll just draw a few here, compared to the males, who would have

a lower proportion with high HDL because they tend to have lower HDL.

So, some of this comparison was being distorted and

attenuated because the females were more likely to have high HDL,

which was associated with lower odds of obesity,

and that's why this unadjusted estimate comparing females

to males is lower in value than the one

that was adjusted when we removed that differential distribution of HDL levels,

higher HDL levels among females disproportionately in our adjusted comparison.
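
To make the diagram concrete, here is a small sketch in Python. The counts are hypothetical, made up purely for illustration and not taken from the data in this lecture. I chose them so that females are more likely to have high HDL, and high HDL goes with lower odds of obesity, which is the pattern we just described.

```python
# Hypothetical counts (NOT from the lecture data): (obese, not obese)
# for each sex within each HDL stratum, L = low HDL, H = high HDL.
counts = {
    "L": {"female": (100, 50), "male": (150, 150)},
    "H": {"female": (100, 200), "male": (30, 120)},
}


def odds(obese, not_obese):
    """Odds of obesity from a pair of counts."""
    return obese / not_obese


def odds_ratio(female, male):
    """Odds ratio of obesity, females to males."""
    return odds(*female) / odds(*male)


def collapse(sex):
    """Add the counts over both HDL strata, ignoring HDL."""
    obese = sum(counts[s][sex][0] for s in counts)
    not_obese = sum(counts[s][sex][1] for s in counts)
    return obese, not_obese


# Odds ratio within each HDL stratum (comparing like with like).
stratum_or = {s: odds_ratio(c["female"], c["male"]) for s, c in counts.items()}

# Crude (unadjusted) odds ratio: collapse over HDL and ignore it.
crude_or = odds_ratio(collapse("female"), collapse("male"))

print(stratum_or)  # 2.0 in both strata
print(crude_or)    # about 1.2, pulled toward one
```

In these made-up numbers the female to male odds ratio is 2 within each HDL level, but only about 1.2 when HDL is ignored. That is the same direction of distortion we saw going from the adjusted to the unadjusted estimate.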

Interestingly enough, the relationship between obesity and HDL didn't change so much,

but the odds ratio did shift slightly,

went down a bit, and actually even though the numbers are close in value,

the confidence intervals do separate and do not overlap between these two estimates.

So, some of the association we were seeing,

it was less of an association before we adjusted.

It was a smaller decrease, on the order of 3.3

percent per unit increase in HDL compared to the adjusted 4.2

percent, because of the disproportionate percentage of females who have

higher HDL levels, and being female was associated with higher odds of obesity.

So, that was pulling down the unadjusted association for HDL slightly.
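
If it helps to see the arithmetic connecting a percent decrease to an odds ratio, here is a minimal sketch in Python. The function names are my own, and the comparison of groups 10 units apart is just an illustration I'm adding; it assumes the association is linear on the log odds scale.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in odds implied by an odds ratio (negative = decrease)."""
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_from_percent(percent_change):
    """Odds ratio implied by a percent change in odds."""
    return 1.0 + percent_change / 100.0


# Per-unit HDL odds ratios implied by the 3.3% and 4.2% decreases above.
unadjusted_or = odds_ratio_from_percent(-3.3)  # about 0.967
adjusted_or = odds_ratio_from_percent(-4.2)    # about 0.958

# Illustration only: odds ratio for two groups 10 HDL units apart,
# which multiplies the per-unit ratio 10 times (assumes linearity in log odds).
unadjusted_or_10 = unadjusted_or ** 10
adjusted_or_10 = adjusted_or ** 10
print(unadjusted_or_10, adjusted_or_10)
```

The per-unit numbers look close, but the gap between them compounds when you compare groups that differ by more than one unit.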

So, it's just interesting to see these things side-by-side and

given what we also looked at as a heads up.

You wouldn't necessarily do

that separate analysis of HDL by sex as part of the analysis,

but I thought it would be interesting to look at as

a precursor to comparing the unadjusted and adjusted results.

Let's go to the next table,

let's bring in age here. It was represented in

the previous table in the unadjusted association, and

we can see age was categorized into quartiles.

The reference was the youngest age group.

The relative odds of obesity for the second quartile compared to

the first was 1.7, for the third to the first it was 1.84,

and for the fourth to the first it was 1.37.

So, not a consistent increase, although

all three older age quartiles have higher odds than the youngest.

So, not a strictly linear type of situation on the log odds scale,

which would play out as a constant relative increase

or decrease on the odds ratio scale.

So, it probably is good that we modeled this as categorical,

but nevertheless there is an association.
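
Here is a quick way to check that for yourself, sketched in Python using the three unadjusted odds ratios from the table. Treating the quartiles as equally spaced steps is my simplifying assumption for this check.

```python
import math

# Unadjusted odds ratios of obesity versus the youngest quartile (from the table).
odds_ratios = {"Q1 (reference)": 1.0, "Q2": 1.7, "Q3": 1.84, "Q4": 1.37}

# Log odds ratios: differences in log odds relative to the first quartile.
log_ors = {q: math.log(v) for q, v in odds_ratios.items()}

# Step from each quartile to the next on the log odds scale.
# Linearity across quartiles would need these steps to be roughly equal.
values = list(log_ors.values())
steps = [b - a for a, b in zip(values, values[1:])]

for q, v in log_ors.items():
    print(f"{q}: log OR = {v:.2f}")
print("steps:", [round(s, 2) for s in steps])
```

The first step is sizeable, the second is small, and the third is negative, so a single slope across quartiles would not describe these estimates well.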

Here's the overall p value for testing whether

the overall association between age and obesity is statistically significant.

Clearly we already see evidence that it is

in the fact that all of our confidence intervals for

comparing the odds of obesity for each of

the non-reference quartiles relative to the reference exclude one.

But, nevertheless this is the proper test result,

proper p value to report because as we noted before you can have

situations where none of the three non-reference groups differ from the reference.

But, some of these differ from each other and this p value will catch that.
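
Written out, and using coefficient names that I'm assuming here just for illustration (they are not from the slide), the unadjusted model with age in quartiles and the null hypothesis behind that overall p value would look something like this, where Q2, Q3, and Q4 are indicators for the three older quartiles:

```latex
\ln(\text{odds of obesity}) = \beta_0 + \beta_1\,\text{Q2} + \beta_2\,\text{Q3} + \beta_3\,\text{Q4},
\qquad
H_0:\ \beta_1 = \beta_2 = \beta_3 = 0
```

Under that null hypothesis all four quartiles have the same odds of obesity, so rejecting it says that at least two of the quartiles differ, whether or not either of them is the reference.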

We now look at model two,

where we've included sex, HDL, and age.

I'll let you go through this on your own,

but what we see, in addition to what we've

learned from the unadjusted results and model one, is that

things don't change much for sex and HDL between model

one, where we've adjusted them for each other,

and model two, where we bring age into the equation as well.

The adjusted associations are similar to what they

were when only sex and HDL were in the model.

So, it doesn't look like these were further confounded by

age differences in those sex or HDL distributions.

While the estimates vary,

the confidence intervals for the adjusted odds ratios for age overlap with

those for their unadjusted estimates.

So, it doesn't look like that changed much in the face of adjusting for both sex and HDL.

Age is still independently above and beyond sex and HDL,

a statistically significant predictor of obesity.

So, let's look at one more example,

predictors of breastfeeding in Nepalese children.

This was a random subset of data on children 12-36 months old.

We wanted to look at what factors are related to breastfeeding.

We started by looking at the relationship between breastfeeding and

sex and then went on to look at some other factors.

So, the unadjusted association,

you may recall, it was not that interesting.

The log odds of being breastfed as a function of

sex, where sex is one for females and zero for males,

resulted in an intercept of 0.85 and a slope for sex of negative 0.02.

If we exponentiate that, we get an unadjusted odds ratio

of being breastfed for females to males of 0.98,

slightly lower odds in the females,

but it was not statistically significant,

especially as the odds ratio was very close to the null value one itself.
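
As a quick check on that arithmetic, here is a minimal sketch in Python using the intercept and slope just quoted. The variable and function names are my own.

```python
import math

# Estimates quoted above for the unadjusted model: log odds = intercept + slope * sex
intercept = 0.85   # log odds of being breastfed for males (sex = 0)
slope_sex = -0.02  # log odds ratio, females (sex = 1) to males


def log_odds(sex):
    """Estimated log odds of being breastfed for a given sex code (0 or 1)."""
    return intercept + slope_sex * sex


odds_male = math.exp(log_odds(0))
odds_female = math.exp(log_odds(1))
odds_ratio = math.exp(slope_sex)  # same as odds_female / odds_male

print(round(odds_ratio, 2))  # 0.98
```

Exponentiating the slope gives the same number as dividing the estimated odds for females by the estimated odds for males, which is why we can read the odds ratio straight off the slope.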

The unadjusted association with age, however,

in this age group from 12 months to 36 months was not surprisingly negative.

The log odds of being breastfed decreased relatively sizeably per increased month of age.

So, the slope of negative 0.24 for age is interpreted as the log odds ratio,

estimating the log odds ratio of being breastfed

comparing two groups of children who differ by one month in age.

If we exponentiate this, we get an odds ratio estimate of 0.79,

21 percent lower odds per increased month of age.

It's statistically significant; this was all in

the lecture on Simple Logistic Regression,

where we computed this confidence interval that went from 0.73 to 0.84.
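
Here is a rough sketch in Python tying those numbers together. The standard error calculation assumes the interval was built as the estimate plus or minus 1.96 standard errors on the log odds scale; that is my assumption for this illustration, so treat the result as approximate.

```python
import math

slope_age = -0.24                 # reported log odds ratio per month of age
ci_low, ci_high = 0.73, 0.84      # reported 95% CI for the odds ratio

odds_ratio = math.exp(slope_age)  # about 0.79

# On the log scale the interval should be roughly symmetric about the slope.
log_low, log_high = math.log(ci_low), math.log(ci_high)
midpoint = (log_low + log_high) / 2

# ASSUMPTION: the interval is estimate +/- 1.96 standard errors on the log scale,
# so the half-width divided by 1.96 approximates the standard error.
approx_se = (log_high - log_low) / (2 * 1.96)

print(round(odds_ratio, 2), round(midpoint, 2), round(approx_se, 3))
```

The midpoint of the interval on the log scale lands close to the reported slope, which is what we would expect since the interval is computed on the log odds scale and then exponentiated.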

I'm just going to remind you,

we already investigated this unadjusted association,

the nature of it, in the lecture on Simple Logistic Regression.

This is the LOWESS plot looking at the relationship between the log odds of

breastfeeding and age to see whether it was a consistent change,

and it is consistently decreasing,

and whether at least it could be roughly estimated by a line.

While we see some curvature here certainly,

we're not doing it a great injustice by fitting a line to that.

So, that's what we had done before and that's why we presented the results

we just did with age being continuous.

So, we wanted to look at sex, age,

and we can bring in other characteristics of the children and the mothers,

so we might bring in the parity category of the mother.

I put this into four categories.

If this child we're studying here is their first child,

they had no previous children, that's the reference.

One previous child, two previous children,

and greater than two for the other three categories.
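
To put a four-category predictor like this into a regression, it gets coded as three indicator variables, with the reference group at zero on all three. Here is a minimal sketch in Python; the function name and the order of the indicators are my own choices for illustration.

```python
def parity_indicators(previous_children):
    """Return (one, two, more_than_two) indicators; all zeros is the reference."""
    if previous_children < 0:
        raise ValueError("number of previous children cannot be negative")
    one = 1 if previous_children == 1 else 0
    two = 1 if previous_children == 2 else 0
    more_than_two = 1 if previous_children > 2 else 0
    return one, two, more_than_two


for n in [0, 1, 2, 5]:
    print(n, parity_indicators(n))
```

Each of the three slopes then compares one parity category to the mothers with no previous children.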

Also, perhaps the mother's age,

and of course that may be related to parity category, especially,

but nevertheless, we brought all these things in and looked at the unadjusted results here.

So, we can see that it looks like greater parity was associated with

lower odds of being breastfed consistently,

although it wasn't quite a dose-response relationship;

it looks like having more children was associated with lower odds across the board.

But this construct was not

a statistically significant predictor of whether a child is breastfed,

nor was mother's age.

There was a slight reduction in the odds that

a child is breastfed with increasing years of mother's age,

but it's not statistically significant in the unadjusted sense.

Let's look at what happens when we look across various multiple regression models.

The first one here looks at the relationship between breastfeeding,

sex, and age taken together.

We can see that the differential in odds between

females and males gets larger, with this estimated odds

ratio of 0.76, an estimated 24 percent lower odds of being

breastfed for females compared to males

of the same age because now we're adjusting for age,

but it's nowhere near statistically significant; that

confidence interval is all over the place and includes one.

It really doesn't look like there was much change in

our understanding of the relationship between

breastfeeding and sex after adjusting for age.

You notice the results for age are absolutely

identical to what they were in the unadjusted sense, so

there was certainly no confounding there by sex.

If we look at subsequent models here,

this second model brings in maternal parity

on top of things,

and if you look carefully at sex and age,

they are very similar again to what they were in

the model where they only adjusted for each other and in the unadjusted associations.

While the estimated odds ratios vary a bit for the different parity categories,

the message is still the same: amongst children of the same sex and age,

the odds in the sample of being breastfed went down with

increased parity of the mother, and even though these estimates look dramatic,

there's a lot of uncertainty in each of them and

the overall construct is not statistically significant.

So, the resulting sex and age adjusted

association between breastfeeding and parity is not statistically significant.

I just went ahead and put in maternal age as well in

this last model just to see if that changed anything or changed itself.

If you look through this,

the results for the other three are pretty much comparable to what

they were in the previous adjusted setup,

and parity was still not statistically significant,

nor was mother's age.

I also reported the baseline odds here.

This would be the exponentiated intercept, and that

would describe different groups in different situations.

In this first one where we only had sex and age in the model,

this would be the estimated odds for male children who were newborns.

While that sounds like it might be relevant,

again our sample only included 12-month-olds

to 36-month-olds, so it doesn't quite cover a group in our sample.

But the reason these are so high,

and we'll talk about this when we get to the section on estimating

probabilities of the outcome for different x combinations,

is that the starting odds of being breastfed at low ages

in males are very high. If we transform this into an estimated probability,

it would be very close to one,

because remember, probability is odds over one plus odds.
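
That conversion is simple enough to sketch in a few lines of Python. The odds values in the loop are hypothetical, chosen only to show how the probability approaches one as the odds get large; they are not the baseline odds from the table.

```python
def odds_to_probability(odds):
    """Convert odds to a probability: p = odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1.0 + odds)


# Hypothetical odds values, just to show the shape of the relationship.
for odds in [0.5, 1.0, 10.0, 100.0]:
    print(odds, round(odds_to_probability(odds), 3))
```

Odds of one correspond to a probability of one half, and once the odds are in the tens or hundreds the probability is already very close to one.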

So, in summary, multiple logistic regression is a tool that relates the log odds

of a binary outcome y to multiple predictors x1 to xP,

generically speaking, via a linear equation of the form that says the log odds

that y equals one is a linear combination of our xs and also includes an intercept.

So, generically speaking, each slope beta hat i,

i equals one to p, is the estimated log odds ratio of y

equals one for two groups who differ by one unit in the predictor xi,

adjusted for all other xs in the model.

These beta hat i's, these slopes,

can be exponentiated to get adjusted odds ratios.

The intercept beta naught hat is

the estimated log odds for the group with all xs equal zero.

This may not be a relevant quantity,

depending on the predictor set x1 through xp,

but it is still necessary to specify the regression equation fully,

and can also be exponentiated to get what I'd call starting or reference odds for

the results, which we build on to get the odds for other groups given their xs.
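
Written in symbols, the summary above looks like this. The notation (p hat for the estimated probability that y equals one, OR hat for an estimated adjusted odds ratio) is one common choice that I'm assuming here:

```latex
\ln\!\left(\frac{\hat{p}}{1-\hat{p}}\right)
  = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \cdots + \hat{\beta}_p x_p,
\qquad \hat{p} = \text{estimated } P(y = 1 \mid x_1, \ldots, x_p)

\widehat{OR}_i = e^{\hat{\beta}_i} \quad (i = 1, \ldots, p),
\qquad
\text{reference odds} = e^{\hat{\beta}_0},
\qquad
\hat{p} = \frac{\text{odds}}{1 + \text{odds}}
```

The first line is the regression equation itself, and the second line collects the three conversions we have used: slope to adjusted odds ratio, intercept to reference odds, and odds to probability.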

In subsequent sections, we'll show how to estimate the confidence intervals

for the various odds ratios we've presented here and in other situations as well,

how to estimate predicted probabilities of outcomes from multiple logistic regression,

and we'll talk a bit about what it would mean to

have good prediction from a multiple logistic regression model.