So, let's look at some examples of

the use of logistic regression in the public health and medical literature.

This will give you an opportunity to interpret the results from

simple and multiple logistic regression models as presented in

published journal articles; we'll look at three examples here.

So, the first one we'll start with is more of a clinical article from JAMA Surgery.

But one of the reasons I chose it is because I

liked the way they described their methods.

So, let's just give the context for this from the abstract.

The importance, they say,

of this study is despite the increasing use of anti-tumor necrosis factor,

TNF, therapy in ulcerative colitis,

its effect on postoperative outcomes remains unclear, with many patients requiring

surgical intervention despite optimal medical management.

So, their objective here is to assess the association

of preoperative use of anti-TNF agents with adverse postoperative outcomes.

So, what they used was insurance claims data from

a large national database to identify

patients 18 years or older who had ulcerative colitis.

These insured patients had inpatient and/or outpatient claims between January 1st,

2005 and December 31st,

2013 with Current Procedural Terminology codes for

one of three different types of surgery: a

subtotal colectomy or total abdominal colectomy,

a total proctocolectomy with end ileostomy,

or a combined total proctocolectomy and ileal pouch-anal anastomosis.

So, these are three different groups.

Only data regarding the first or index surgical admission

within the time frame were extracted.

Use of anti-TNF agents, corticosteroids,

and immunomodulators within 90 days of surgery

were identified using Healthcare Common Procedure Coding System codes.

So, what they were looking at as far as outcomes go in

the first 90 days were complications from the surgery,

emergency department visits, and readmissions.

Multivariable logistic regression

was used to model these outcomes as a function of covariates,

and multivariable logistic regression is a synonym for multiple logistic regression.

Their primary predictor was anti-TNF agent use,

but they included other things to control for

differences between those who received this and those who didn't

that may also be related to the outcomes as well.

They had almost 2,500 patients,

2,476, a little over half, 55.75% were men,

and the mean age was 42.1 years,

but there was some variability in the individual ages as

the standard deviation was 12.9. Among these groups,

a little over a third, 950,

underwent subtotal colectomy or total abdominal colectomy,

354 of the 2,476 underwent total proctocolectomy with end ileostomy,

and another 47.3% received ileal pouch-anal anastomosis.

So, these are the three different groups we have here.

I'll come back to the results as we work our way

through the main sections of the article now.

So first, they did a nice job of describing their statistical approach.

So, they first said,

and this is very standard to talk about doing some unadjusted comparisons

between the exposed and unexposed groups of interest when there is

a single exposure of interest,

like in this case, anti-TNF agents.

So, they said Wilcoxon and chi-square tests, as

appropriate, were used to compare preoperative variables and

post-operative outcomes between patients receiving

and patients not receiving anti-TNF agents in each surgical group.

The Wilcoxon test is analogous to a t-test comparing means, but

instead of comparing a summary statistic at the population level, like means,

between two groups, it compares the distributions.

Certainly, they could have used the two-sample unpaired t-test here as well,

and then the chi-square test is for comparing

binary categorical outcomes between the two groups,

those who got the anti-TNF agents and those who didn't.
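To make the rank-based idea concrete, here is a minimal sketch of a Wilcoxon rank-sum (Mann-Whitney) test using a normal approximation. The function name `rank_sum_test` and the simulated ages are illustrative assumptions, not the authors' code or data, and the sketch assumes no tied values.

```python
import math
import random

def rank_sum_test(group_a, group_b):
    """Wilcoxon rank-sum (Mann-Whitney U) test with a normal approximation.

    Assumes no tied values, so no tie correction is applied to the variance.
    Returns (U statistic for group_a, z score, two-sided p-value).
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pool the observations, remembering which group each one came from.
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    # Sum of the 1-based ranks that belong to group_a.
    rank_sum_a = sum(rank for rank, (_, g) in enumerate(pooled, start=1) if g == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_a, z, p_two_sided

# Synthetic ages (NOT the study data): two groups that differ in location.
random.seed(0)
treated = [random.gauss(38, 13) for _ in range(250)]
untreated = [random.gauss(42, 13) for _ in range(700)]

u, z, p = rank_sum_test(treated, untreated)
print(f"U = {u:.0f}, z = {z:.2f}, two-sided p = {p:.4g}")
```

Because the test only uses ranks, it compares where the two distributions sit relative to each other rather than comparing their means directly.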

They go on to say multivariable,

which is a synonym for multiple, logistic regression

was used to model the occurrence of each outcome,

remember there are three outcomes,

within 90 days by the following covariates: anti-TNF agent use,

which was their primary predictor of interest, and age, sex,

co-morbidity index, malnutrition, failure to thrive, etc.

They go on to say a linear association for co-morbidity index was assumed.

What they mean there is on the log-odds scale in the multiple logistic regression.

They go on to say though that non-linear associations on

the log-odds scale for age were examined in these models

using something called B-splines;

this is something very similar to lowess.

So, they allowed for the relationship on the log-odds scale between

the log odds of any of the outcomes and age to be

flexible, and they found that even by doing that,

there was evidence of a linear fit.

So they said, but linear associations were found to be

appropriate in all cases and were used in the final models.

Odds ratios and 95% confidence intervals were reported for all factors in the models.

P less than 0.05 was considered statistically significant and 2-sided P values were used.

They go on to say, "Among this subset of patients who did receive anti-TNF therapy,

we examined whether outcomes differed according to time

between anti-TNF agent use and surgery."

They first used univariable, or simple, logistic regression to

model the occurrence of each outcome as a function of time since infusion.

Time was continuous, and they looked at whether

the relationship between the log odds of

the outcome and time was linear or not.

They looked at the possibility of non-linearity using B-splines,

again, this is a similar approach to lowess,

but all non-linear curves were nonsignificant.

So, there is a way to actually fit a more formal lowess

where you can test whether

there's evidence in the fitted model of deviation from linearity,

and what they're saying is they looked at

this but all non-linear curves were not significant.

In other words, there was no advantage statistically

of assuming a non-linear relationship,

and so they ultimately went back and treated time as

a linear predictor in the log odds model.
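One way to picture a formal test for deviation from linearity is a likelihood-ratio test between a model that is linear in time on the log-odds scale and a more flexible one. The sketch below swaps in a simple quadratic term for the B-splines the authors describe, so it is a stand-in for the idea, not their method. The helper names, the simulated data, and the true coefficients are all assumptions made for illustration.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_logistic(rows, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson.

    Returns (coefficients, maximized log-likelihood).
    """
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]  # X'WX, the observed information
        for x, yi in zip(rows, y):
            eta = sum(b * v for b, v in zip(beta, x))
            mu = 1 / (1 + math.exp(-eta))
            for j in range(k):
                grad[j] += (yi - mu) * x[j]
                for l in range(k):
                    info[j][l] += mu * (1 - mu) * x[j] * x[l]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    loglik = 0.0
    for x, yi in zip(rows, y):
        eta = sum(b * v for b, v in zip(beta, x))
        loglik += yi * eta - math.log(1 + math.exp(eta))
    return beta, loglik

# Simulated data (NOT the study data): outcome truly linear in time on the log-odds scale.
random.seed(1)
days = [random.uniform(1, 90) for _ in range(254)]
t = [(d - 45) / 30 for d in days]  # centered and rescaled for numerical stability
y = [1 if random.random() < 1 / (1 + math.exp(-(-0.5 + 0.2 * ti))) else 0 for ti in t]

_, ll_linear = fit_logistic([[1.0, ti] for ti in t], y)
_, ll_quad = fit_logistic([[1.0, ti, ti * ti] for ti in t], y)

# Likelihood-ratio statistic for the one extra (quadratic) parameter.
lrt = max(2 * (ll_quad - ll_linear), 0.0)
p_value = math.erfc(math.sqrt(lrt / 2))  # chi-square upper tail with 1 degree of freedom
print(f"LRT = {lrt:.3f}, p = {p_value:.3f}")
```

A large p-value here would be read the same way as in the article: no statistical advantage to the more flexible fit, so the linear term is retained.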

The estimated probabilities of each outcome as

a function of time were then constructed from these models.

Multivariable analysis could not be

performed because the event rates of each outcome were too

small in the group that solely received the anti-TNF agents.

In addition, they say a sensitivity analysis that

excluded patients undergoing emergency surgery was performed,

given the inherent heterogeneity in disease severity

and operative complexity in this population.

All logistic models were refit after these exclusions, and they looked at

the results and compared them to the models that included those patients.

But here's what I wanted to focus on in this section here,

is they say statistical analyses were conducted using SAS software version 9.4 and

R software version 3.1.2.

What they're showing here in this table are only the adjusted odds ratios.

There's information elsewhere in

the article for some of these unadjusted associations,

if one wants to look at confounding.

Let's just look at what they found with anti-TNF agent use as the primary predictor.

They found that for all three outcomes,

any complications, emergency department visits and readmissions,

those who got the anti-TNF agents had

an estimated lower odds of having

the outcome but after accounting for sampling variability,

the resulting confidence intervals all included the null value of one,

and the resulting P values for testing the null that there was

no relationship between anti-TNF agents and these outcomes;

all P values were greater than 0.05.
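The link between a confidence interval that includes one and a P value above 0.05 can be sketched with a Wald interval built on the log-odds scale. The coefficients and standard errors below are hypothetical and are not taken from the article's table; `odds_ratio_summary` is an illustrative helper name.

```python
import math

def odds_ratio_summary(log_or, se, z_crit=1.96):
    """Turn a logistic regression slope and its standard error into
    (odds ratio, 95% CI lower, 95% CI upper, two-sided Wald p-value)."""
    odds_ratio = math.exp(log_or)
    lower = math.exp(log_or - z_crit * se)
    upper = math.exp(log_or + z_crit * se)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, lower, upper, p_value

# Hypothetical slope and standard error (NOT the article's estimates).
or_hat, lower, upper, p = odds_ratio_summary(log_or=-0.20, se=0.17)
print(f"OR = {or_hat:.2f}, 95% CI {lower:.2f} to {upper:.2f}, p = {p:.2f}")

# A second hypothetical slope, farther from zero with the same standard error.
or_hat2, lower2, upper2, p2 = odds_ratio_summary(log_or=-0.50, se=0.17)
print(f"OR = {or_hat2:.2f}, 95% CI {lower2:.2f} to {upper2:.2f}, p = {p2:.4f}")
```

In the first case the estimated odds ratio is below one but the interval spans one, which is the pattern described for all three outcomes; in the second, the interval excludes one and the P value falls below 0.05.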

In fact, across these they didn't find

many statistically significant predictors of the outcomes after

adjusting for each other in these models.

So, just to be clear

about what I'm getting at with this model,

this model here comes from

a multiple logistic regression that starts as the log odds of

any complication equals some estimated intercept.

Notice they don't provide the intercepts here,

which means we could not, as the reader,

take these results and estimate the probabilities of

any complications for different groups given their predictor values. Albeit,

most of the predictors are not statistically associated with the outcome,

so that may not be that useful.

So, they have a slope times x1,

which might be a one if they used anti-TNF agents

and a zero if they did not.

Then, we have a slope for age.

Notice how they coded that though,

this was in 10-year increments, but this is age.

Remember, they talked about testing for nonlinearity and they

didn't find any reason not to assume linearity so they went with that.

Then we have another slope for sex,

which is a 1 for females and a 0 for males as per their designation here,

and so on, and so forth.
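As a rough sketch of the model structure just described, the code below builds the log odds from an intercept plus slopes times predictors and shows why the missing intercept matters: the odds ratio for anti-TNF use is the same whatever the intercept is, but the predicted probabilities are not. Every number here is hypothetical, and the predictor names are assumptions chosen for illustration.

```python
import math

def inv_logit(log_odds_value):
    """Convert log odds to a probability."""
    return 1 / (1 + math.exp(-log_odds_value))

def log_odds(intercept, slopes, x):
    """Linear predictor: intercept + sum of slope * predictor value."""
    return intercept + sum(slopes[name] * x[name] for name in slopes)

# Hypothetical slopes, i.e. log odds ratios (NOT the article's estimates).
slopes = {
    "anti_tnf": math.log(0.85),     # 1 if anti-TNF agent used, 0 if not
    "age_per_10y": math.log(1.10),  # age measured in 10-year increments
    "female": math.log(0.95),       # 1 for females, 0 for males
}

treated = {"anti_tnf": 1, "age_per_10y": 4.2, "female": 0}
untreated = {"anti_tnf": 0, "age_per_10y": 4.2, "female": 0}

results = []
for intercept in (-2.0, -0.5):  # two made-up intercepts
    lo_treated = log_odds(intercept, slopes, treated)
    lo_untreated = log_odds(intercept, slopes, untreated)
    odds_ratio = math.exp(lo_treated - lo_untreated)
    results.append((odds_ratio, inv_logit(lo_treated), inv_logit(lo_untreated)))
    print(intercept, round(odds_ratio, 2),
          round(inv_logit(lo_treated), 3), round(inv_logit(lo_untreated), 3))

# With age in 10-year units, the implied odds ratio per single year is the tenth root.
or_per_year = math.exp(slopes["age_per_10y"] / 10)
```

The odds ratio column is identical across the two intercepts while the probabilities change, which is why a table of adjusted odds ratios alone cannot be turned into predicted probabilities.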

So, this is the basis for the models,

but again, if you look across this table,

there are very few things that they found were

statistically associated with the outcome of any complication.

The second model looked like the first one except they were estimating the log odds of

an emergency department visit within 90 days postoperatively.

For the third outcome, readmission, they looked at the log odds and

ultimately expressed things on the odds ratio scale in this table.

But they didn't find, even in these adjusted analyses, many predictors of these outcomes

for those who had subtotal colectomy or total abdominal colectomy.

So, let's see how they summarize this.

A total of 950 patients who underwent

subtotal colectomy or total abdominal colectomy procedure were identified,

of whom 254, 26.7 percent, had claims for an anti-TNF agent within 90 days of surgery,

given at a mean of 39.1 days prior.

Patients receiving anti-TNF agents compared with those with

no anti-TNF agent use were significantly younger,

mean age 37.6 versus 42.4 years P less than 0.001.

So, this is a comparison of the mean age between these two groups,

an unadjusted comparison. They also underwent fewer emergency surgical procedures, 33,

13 percent, versus 191,

27.4 percent, P less than 0.001,

but did not differ regarding sex,

comorbidity index, or malnutrition status.
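The emergency surgery comparison can be checked with a chi-square test on the 2x2 table implied by the counts just quoted. This is a minimal sketch: the 696 untreated patients are derived as 950 minus 254, and the test is the Pearson chi-square without continuity correction, which is an assumption since the article's exact variant is not stated here.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (test statistic, p-value on 1 degree of freedom).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square upper tail, 1 df
    return stat, p_value

n_total, n_treated = 950, 254
n_untreated = n_total - n_treated           # derived: 696 with no anti-TNF claims
emerg_treated, emerg_untreated = 33, 191    # emergency procedures in each group

stat, p = chi_square_2x2(
    emerg_treated, n_treated - emerg_treated,
    emerg_untreated, n_untreated - emerg_untreated,
)

# Unadjusted odds ratio of emergency surgery, anti-TNF versus no anti-TNF.
unadjusted_or = (emerg_treated / (n_treated - emerg_treated)) / (
    emerg_untreated / (n_untreated - emerg_untreated)
)

print(f"{100 * emerg_treated / n_treated:.1f}% vs {100 * emerg_untreated / n_untreated:.1f}%")
print(f"chi-square = {stat:.2f}, p = {p:.2g}, unadjusted OR = {unadjusted_or:.2f}")
```

The percentages reproduce the 13 percent and 27.4 percent quoted above, and the P value falls below 0.001, consistent with what the authors report.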

Significantly more patients receiving

anti-TNF therapy compared to those with no anti-TNF therapy use,

had corticosteroid use and immunomodulator use.

So, there were two more significant differences

between those who got anti-TNF agents and those who didn't.

They go on to then say,

"In univariate," in other words unadjusted or simple logistic regression,

"analysis, patients receiving anti-TNF agents,

compared with those with no anti-TNF agents, had fewer ED visits within 90 days of surgery."

That was 31.1 percent versus 38.8

percent in the group who didn't get anti-TNF agents, and that's statistically significant,

but there were no differences between these two groups for readmissions or complications.

However, on multivariable analysis,

when they did the multiple logistic regression,

which we just looked at the results of,

the receipt of therapy was not significantly associated with these outcomes,

and that's what we were talking about in that last table.

They also did this for the other two types of surgery as well,

but I'll just focus on this, for now.

One thing they do say here, and I'm going to

bring back something we've talked about, is,

"Amongst those who received the biologic agent,"

that is, amongst those with subtotal colectomy or

total abdominal colectomy who did receive an anti-TNF agent,

"the timing of its most recent administration did not

influence the occurrence of any adverse outcomes within 90 days."

This is what they're talking about,

remember they talked about that secondary analysis where they

did a logistic regression of the log odds

of each of the outcomes on the time since getting the anti-TNF agent,

in only the subset of patients who got the anti-TNF agents.

What they're showing here is from that logistic regression,

relating the log odds linearly to the days

since most recent anti-TNF agent use in that 90-day period.

They're showing the predicted probabilities from

that logistic regression model and a confidence band around it,

so each point on here estimates the predicted proportion of complications in patients who

got a subtotal or total abdominal colectomy and were on

anti-TNF agents,

at each day in the 90-day period.

So, they are basically taking the results from their linear logistic regression model

relating complications to time or day of most recent biological anti-TNF agent use,

and transforming those into predicted proportions or

probabilities and graphing that as a function of time.
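The transformation from a fitted log-odds line to a curve of predicted probabilities with a confidence band can be sketched as follows. The band is computed on the log-odds scale and then transformed, which keeps it inside zero and one. The intercept, slope, variances, and covariance below are hypothetical values, not the article's fitted model, and this is only one common way of building such a band.

```python
import math

def inv_logit(x):
    """Convert log odds to a probability."""
    return 1 / (1 + math.exp(-x))

# Hypothetical fitted model (NOT from the article): log odds = b0 + b1 * day.
b0, b1 = -0.60, -0.004
# Hypothetical variances and covariance of the two estimates.
var_b0, var_b1, cov_b0_b1 = 0.0625, 0.00002, -0.0009

def predicted_probability(day, z_crit=1.96):
    """Return (estimate, lower, upper) for the probability of the outcome at a given day."""
    eta = b0 + b1 * day
    # Variance of the linear predictor b0 + b1 * day.
    se = math.sqrt(var_b0 + day ** 2 * var_b1 + 2 * day * cov_b0_b1)
    return inv_logit(eta), inv_logit(eta - z_crit * se), inv_logit(eta + z_crit * se)

curve = [(day, *predicted_probability(day)) for day in range(0, 91, 15)]
for day, est, low, high in curve:
    print(f"day {day:2d}: {est:.3f} ({low:.3f} to {high:.3f})")
```

Plotting the estimate and the two limits against day would give a figure of the same general form as the one described, with a line and a band around it.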

Again, these results were not statistically significant, however.

They're showing in this portion of the graphic,

the outcome of total complications for the three different types of surgeries shown,

this is part of a larger graphic that went on to show it for the other two outcomes,

ED visits, and readmissions as well.

But I like this because they talked about investigating

the linearity assumption in their methods section, found that there was

no reason not to assume linearity, and now they're presenting

the predicted probabilities from these regression models graphically.