We may overestimate what happens at the early bilirubin levels and underestimate it

at the later ones by fitting a line to it,

but we wouldn't miss the overall gestalt, which is that there is

a consistent increase in the log hazard of death, after adjusting for treatment,

age, and sex, with increasing levels of bilirubin.

So, if I were working with an expert,

a hepatologist or somebody else,

I would seek their expert opinion on which one was

the more biologically relevant way to model it,

but certainly, by including bilirubin as continuous,

it may not be optimal, but it certainly is not going to miss

the big idea that the log hazard increases substantially with increasing bilirubin.

So, given data,

if you were to work on a data analytic team or analyze your own data,

and you got a dataset with a time-to-event outcome, how would you choose

what your final multiple regression model would be?

Would you keep all the x's in?

You first have to come up with a working hypothesis or question.

We want to see how these 10 things are related to survival, for example.

But in many cases,

the whole idea can feel a little overwhelming to start.

I have all these possible predictors,

there may be some confounding,

I can look at that, et cetera,

but what constitutes the best final regression model?

That really depends, as with the other types of regression, on the goals of the research.

So, if the goal is to maximize the precision of the adjusted estimates,

these being the adjusted log hazard ratios and ultimately the adjusted hazard ratios,

it makes sense to keep only those predictors that

are statistically significant in the final model,

so that you don't have to estimate slopes for things that don't add

knowledge about the outcome after accounting for the other predictors in the model.

Keeping them would compromise the precision of the estimates for the predictors that are

associated, because we'd be estimating more things with the same amount of data,

where some of these things don't add any information.

If the goal is to present results comparable to the results of

similar analyses presented by other researchers on similar or different populations,

you want to present at least one model that includes

the same predictor set as the other research,

even if some of the predictors that they

used are not statistically significant with your data.

If the goal is to show what happens to the magnitude of an

association with different levels of adjustment,

you may present the results from several models that include

different subsets or combinations of adjustment variables.

So, the first column would have all the unadjusted results.

The second column might be adjusted for demographic characteristics.

The third may be adjusted for sociological characteristics.

The fourth might be adjusted for biological characteristics.

Then, the fifth column may have all adjustment factors in one model,

so that the reader could look at the association

between the survival outcome and any specific predictor

and see what happens to it when comparing the unadjusted to the different adjusted results.

They might get some sense of what types of factors are confounding that relationship, if any.
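
One way to organize such a table of models is sketched below. The predictor names and their groupings are hypothetical, chosen only for illustration, and the code just builds the list of adjustment sets (one per column); it does not fit any model.

```python
# Hypothetical predictor of interest and adjustment variables (illustrative names only).
exposure = "treatment"
adjustment_groups = {
    "demographic": ["age", "sex"],
    "sociological": ["education", "income"],
    "biological": ["bilirubin"],
}

def build_model_columns(exposure, groups):
    """Return (label, predictor list) pairs: unadjusted, one per group, then all."""
    columns = [("unadjusted", [exposure])]
    for name, variables in groups.items():
        columns.append((f"adjusted: {name}", [exposure] + variables))
    everything = [v for variables in groups.values() for v in variables]
    columns.append(("adjusted: all", [exposure] + everything))
    return columns

model_columns = build_model_columns(exposure, adjustment_groups)
for label, predictors in model_columns:
    # One line per table column: the label and the predictors in that model.
    print(f"{label:25s} {' + '.join(predictors)}")
```

Each predictor list would then be handed to whatever Cox regression routine is being used, and the estimate for the predictor of interest compared across the columns.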

But the goal of prediction, well,

it's a slightly more complicated story,

and we can't give it a full treatment in this course,

but we will briefly discuss the ideas of prediction and how to

estimate survival curves from Cox regression results.

So, we did this and we talked about this for simple Cox regression;

we'll just extend the idea to multiple Cox.

We can use the results of the regression to get estimated

cumulative survival curves for a given set of x values used in the regression.

How to do this is mathematically involved.

But what you could do for display purposes is use the computer to estimate

these curves and then show different curves for

several specific value sets of x1 through xp,

and I'll show some examples of that.

So, how would you get the estimated cumulative survival?

Here's the math part. If you're not familiar with calculus,

or you're not into the math, don't worry about it,

I'll talk about it conceptually.

But I do want to show the math for those who are

interested and may do further statistical courses.

So, how would you do that given a set of predictor values x1 through xp?

Well, you have this equation that estimates the log hazard of the outcome occurring at

any given time in the follow-up period, and it is a function

of the predictors x1 through xp and time.

So, what we need, or what the computer would do for

us, is to fill in, at any given time,

the intercept piece here,

and then this could be added to

the linear combination of slopes times their specific predictor values,

and this would turn out to be a number here,

and this would be the estimated log hazard at

that given point in time for the group defined by their x values.

Then this could be translated into an estimated hazard

at that time for the group given their x values.
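
Written out, the relationship just described looks roughly like the following. The notation is an assumption on my part, since the slide itself is not part of this transcript: a hat marks an estimate, \lambda_0(t) is the time-dependent intercept piece (the baseline hazard), and the \beta's are the slopes.

```latex
% Estimated log hazard at time t for the group with predictor values x_1, ..., x_p
\log \hat{\lambda}(t \mid x_1, \dots, x_p)
  = \log \hat{\lambda}_0(t) + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_p x_p

% Exponentiating gives the estimated hazard at that time for the same group
\hat{\lambda}(t \mid x_1, \dots, x_p)
  = \hat{\lambda}_0(t) \, \exp\!\left( \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_p x_p \right)
```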

Then what will happen, just like we did with simple Cox regression, is that,

in order to estimate the survival curves,

we use the fact that survival depends on the cumulative hazard, or risk,

incurred in that group up through the time we're looking at.

So, if we're looking at 30 weeks,

the cumulative hazard will involve the risk from time zero up

to 30 weeks for that group that's defined by these x values.

This is found by integrating this function over time.

Integration, you can think of as just summing up

these time-specific hazards from time zero up to

the current time that's being assessed and getting

the cumulative risk incurred by this group up through the time

this has been computed for.

Then the survival at that time is

a function of the cumulative hazard: it's the

exponentiated version of the negative of that cumulative hazard.
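
For those following the math, these two steps can be written as below. The symbols are assumed notation rather than something taken from the slide: \hat{\lambda}(u \mid x) is the estimated hazard at time u for the group with predictor values x = (x_1, ..., x_p), \hat{H} is the cumulative hazard, and \hat{S} is the survival.

```latex
% Cumulative hazard: the time-specific hazards accumulated from time zero up to time t
\hat{H}(t \mid x) = \int_{0}^{t} \hat{\lambda}(u \mid x) \, du

% Survival at time t: the exponentiated negative of the cumulative hazard
\hat{S}(t \mid x) = \exp\!\left( -\hat{H}(t \mid x) \right)
```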

So, this is just FYI,

and if you don't follow the math,

that's fine. Just think of this conceptually:

the more risk a group has incurred up through a given time,

the lower their survival beyond that time.

So, more risk equals lower survival.
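
To make "summing up the hazards" concrete, here is a minimal numerical sketch. Every number in it is made up for illustration (the weekly baseline hazard and both slopes are hypothetical, not estimates from any study), and the integral is approximated by a simple week-by-week sum.

```python
import math

# Illustrative numbers only: a made-up baseline hazard and made-up slopes.
n_weeks = 30
baseline_hazard = [0.01] * n_weeks   # hypothetical weekly hazard for the reference group
slopes = {
    "age_above_reference": 0.04,     # hypothetical log hazard ratio per year of age
    "bilirubin": 0.15,               # hypothetical log hazard ratio per mg/dL
}

def survival_curve(x_values):
    """Estimated survival at the end of each week for a group with the given x values."""
    linear_predictor = sum(slopes[name] * x_values[name] for name in slopes)
    cumulative_hazard = 0.0
    curve = []
    for h0 in baseline_hazard:
        # Hazard for this group in this week = baseline hazard * exp(linear predictor).
        cumulative_hazard += h0 * math.exp(linear_predictor)
        # Survival = exp(-cumulative hazard incurred so far).
        curve.append(math.exp(-cumulative_hazard))
    return curve

low_risk = survival_curve({"age_above_reference": 0, "bilirubin": 1})
high_risk = survival_curve({"age_above_reference": 0, "bilirubin": 10})
print(round(low_risk[-1], 3), round(high_risk[-1], 3))
```

The group with the larger linear predictor accumulates hazard faster, so its curve sits below the other one at every week, which is the "more risk equals lower survival" idea.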

So, let's look at some predicted survival curves based on

these Cox regression results for the PBC study.

Again, we have a model here,

this is the multiple model, which

includes treatment, age, bilirubin and sex.

Using the computer, I could create

some estimated survival curves for different subgroups based on their x values.

So, I can't show

all possible survival curves for all possible unique combinations of x values,

but we could look at this.

This is something that might be presented in the paper to

show what the impact of bilirubin,

for example, is among different age groups on survival.

So side by side here,

I have age-specific survival curves.

This was the age at the start of the study for a group of

female patients with low bilirubin who were in the drug group, DPCA,

and side by side with this is a group of

female patients with high bilirubin on the same drug.

So we show this separately by age group;

both groups are female,

both groups are on the same treatment.

So, recall that we noticed that

the hazard ratio associated with bilirubin was large,

a 16 percent increase in the relative mortality,

the relative hazard of death, per one milligram per deciliter increase.
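
As a quick worked example of what that 16 percent per milligram per deciliter implies for larger differences, here is a small sketch. It assumes the linear-on-the-log-hazard-scale model discussed earlier, and the 5 mg/dL difference is just an illustrative choice, not a value from the study.

```python
import math

hazard_ratio_per_unit = 1.16              # from the lecture: 16% higher hazard per 1 mg/dL
slope = math.log(hazard_ratio_per_unit)   # the corresponding log hazard ratio (slope)

def hazard_ratio_for_difference(delta_mg_dl):
    """Hazard ratio comparing two groups whose bilirubin differs by delta_mg_dl."""
    return math.exp(slope * delta_mg_dl)

hr_5 = hazard_ratio_for_difference(5)     # illustrative 5 mg/dL difference
print(round(hr_5, 2))                     # roughly 2.1, i.e. about double the hazard
```

Because the slopes add on the log scale, the hazard ratios multiply: a 5 mg/dL difference corresponds to 1.16 raised to the fifth power.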

What I did here, for display purposes, was

take bilirubin and put it into quartiles and then show what the impact would be

for these specific quartiles given the numerical values for them.

So, you can see that, certainly, in both cases,

the older the age, the worse the survival.

But for comparable age groups with low bilirubin versus high,

you can see the survival is much better for

the low bilirubin groups than the high bilirubin groups.

In the worst case scenario,

for the oldest age group with

low bilirubin, survival after 12 years is on the order of 60 percent,

so 60 percent make it beyond 12 years.

Compare that to almost nobody

surviving beyond 12 years in the group with high bilirubin at baseline.

Let's look at another example of using

Cox regression results to present predicted survival curves.

This is from predictors of infant mortality, including gestational age,

whether the mother was given,

in this randomized study,

beta carotene, vitamin A, or placebo,

the sex of the child, and maternal parity.

I'm going to use the results from model three, which

includes all four predictors, to look at

some predicted survival curves for different subgroups of children.

So, here are some examples of estimated survival curves

based on the multiple Cox regression results from model three.

So, we can see these are split out by gestational age groups.

We're looking at males with two to four older siblings

whose mothers were randomized to the placebo arm in

this left graphic, and in the right graph,

we're looking at female infants with two to four older siblings

whose mothers were randomized to the placebo arm.

We know that sex had very little to do with

survival either before or after adjusting for these other things.

So we can see very clearly,

again, that gestational age

drives survival and that being preterm was a huge risk factor,

but this puts an absolute percentage on

the risk as opposed to just the

relative comparison that we get from the hazard ratios from the model.

So, it's nice sometimes to display

the predicted survival curves for some of the subgroups involved to get a sense of

the absolute nature and magnitude of the risk of

the outcome above and beyond what we have from the relative hazard ratio comparisons.

I had mentioned before that there is uncertainty

in the intercept estimate as a function of time.

Confidence intervals for these curves can be created,

so what I'm showing here are just two of the gestational age groups, as

opposed to all five on one graph, because it gets messy with the confidence intervals.

I'm looking at those who were 36 to 38 weeks and those who were less than 36 weeks.

Here are their estimated survival curves and the confidence intervals.

The uncertainty in these estimates

comes from the uncertainty in the intercept for the model,

as well as the uncertainty in

the slopes for the factors that are used to estimate these.

All that uncertainty gets transformed when we transform

back from the log hazard scale to the survival scale.

It's complex, but confidence intervals for these curves can be created.

Now, think about using the resulting curves for prediction, to predict,

for example, survival for infants who were not enrolled in the study and then triage them.

How do we get a sense of how well these curves

predict? Just like we've seen for the other types of

regression, measures of model prediction evaluated

using the same data used to fit the model are overly optimistic.

There are ways that we can't get into here to assess how

well these curves predict for the given dataset.

But in order to do it properly,

we would want to do some calibration where we either use

another set of comparable data and see how

well the model we fit on this particular set

predicts for the other set of data, or, in the absence of having that,

if there's enough data in the original data we have for the model,

we could split the data randomly into two subsets and use one of

the subsets to fit the Cox model and the other to evaluate its predictive power.
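
The random split itself is simple to do. Here is a minimal sketch using only Python's standard library; the number of records, the 50/50 fraction, and the seed are all hypothetical choices for illustration, and the code only assigns records to the two subsets (it does not fit or evaluate a Cox model).

```python
import random

def split_indices(n_records, fit_fraction=0.5, seed=2024):
    """Randomly split record indices 0..n_records-1 into a fitting set and an evaluation set."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    indices = list(range(n_records))
    rng.shuffle(indices)
    n_fit = int(round(n_records * fit_fraction))
    return sorted(indices[:n_fit]), sorted(indices[n_fit:])

# Hypothetical dataset of 300 records.
fit_idx, eval_idx = split_indices(n_records=300)
print(len(fit_idx), len(eval_idx))
```

The model would be fit only on the records in the first subset, and its predictions would then be compared with what actually happened to the records in the second.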

Again, we don't show how to evaluate the predictive power of

a Cox model because it's beyond the scope of what we can do in this course,

but just remember, again, this principle: not using the data that was used to create the model

to validate the model makes for a better assessment.

So in summary, what we've talked about here is that when

including a potential predictor in the Cox model that's continuous,

there's an underlying assumption in the model,

an underlying linearity assumption,

and this is that the relationship between the log hazard and

the continuous predictor is linear in nature after adjusting for

other predictors in the model, and we showed how to investigate that empirically.

We also showed that the results from multiple Cox regression can be used to

produce estimated survival curves for groups given a specific set of predictor values.

We reinforced the idea that if one wants to build a predictive model and

evaluate how well it predicts for the population from which the sample came,

it's best to validate the prediction on

another data set from the same population that was not used to fit the model.