0:00

A natural follow up to the Bernoulli distribution is what is called

the binomial distribution, and the binomial distribution happens

when you perform a set of Bernoulli trials.

So we start with our basic Bernoulli, and we repeat it.

And we say how many successes did we get in those trials.

We often give the number of trials the letter little n.

Now, an example of a binomial would be to toss a coin ten times and

count the number of heads.

Each of the tosses is a Bernoulli experiment and

if we perform that Bernoulli experiment ten times and add up

the number of successes then we have what is called a binomial random variable.
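As a rough sketch of that idea, the snippet below adds up ten simulated Bernoulli trials to get one binomial outcome. The function names `bernoulli` and `binomial_draw`, and the fixed seed, are illustrative choices and are not from the lecture.

```python
import random

def bernoulli(p, rng):
    """One Bernoulli trial: 1 (success) with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0

def binomial_draw(n, p, rng):
    """Add up n independent Bernoulli trials to get one binomial outcome."""
    return sum(bernoulli(p, rng) for _ in range(n))

rng = random.Random(0)  # fixed seed (arbitrary assumption) so reruns match
heads = binomial_draw(10, 0.5, rng)  # toss a fair coin ten times, count heads
print(heads)  # a whole number between 0 and 10
```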

And these binomials are commonly used modeling distributions.

Now, there's one subtlety here about the binomial random variable, which is that

the trials need to be what are termed independent events.

I want to briefly talk about what we mean by independence.

Independence is a basic modeling assumption in many probabilistic models.

And formally, what independence means is that the probability that events A and

B both happen can be written as the probability that

A happens times the probability that B happens.

And just going back to thinking of tossing a coin twice,

what's the probability I get two heads?

Well, given the coin tosses are independent, that's the probability

the first is a head times the probability that the second is a head.

And it's when you put in the word times there that you're using this independence

belief.
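A small check of the two-heads example, assuming a fair coin: counting the equally likely outcomes and multiplying the individual probabilities give the same answer. The variable names are illustrative only.

```python
from itertools import product

p_head = 0.5  # fair coin, as in the example

# The four equally likely outcomes of two tosses: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))
by_counting = sum(1 for o in outcomes if o == ("H", "H")) / len(outcomes)

# Under independence, the same probability comes from multiplying
by_product = p_head * p_head

print(by_counting, by_product)  # 0.25 0.25
```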

And so an independence assumption certainly makes

probability models more straightforward,

because we can essentially calculate probabilities through multiplication.

Another way of understanding independence is that knowing that one event

has occurred, that A has occurred, provides no information about

a subsequent event, which we'll call B.

So going back to the coin tossing, we're going to toss the coin twice.

If I tell you that I got a head the first time around how much

does knowing I got a head on the first coin toss tell me

about what's going to happen on the second coin toss?

The answer is absolutely nothing, because the events are independent.

So that's a nice way of thinking about whether two events are independent.

Does knowing the outcome of one impact the probability of a second event?

If it doesn't, then you can view the events as independent.
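Written as a formula, this second way of thinking says that conditioning on A does not change the probability of B. The conditional-probability notation is an addition here, not something shown in the lecture, and the equivalence with the product rule assumes A has nonzero probability.

```latex
\[
P(B \mid A) = P(B)
\quad\Longleftrightarrow\quad
P(A \cap B) = P(A)\,P(B),
\qquad \text{provided } P(A) > 0 .
\]
```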

And that is often a simplifying assumption that underlies

lots of probability models.

Now, it doesn't mean that it's true, though, and

it's something that the modeller would need to think about carefully,

maybe look at some data to see whether or not the independence assumption was

realistic, but just note that in this binomial distribution,

we're taking these Bernoulli building blocks.

We're going to replicate, repeat that Bernoulli experiment, and

we're going to assume that the outcomes from one trial to the next are independent.

So that's a binomial:

the number of successes in n independent Bernoulli trials.

So the example that I'm talking about here

is to toss a fair coin ten times and count the number of heads.

That would be a binomial.

Going back to the drug development, if we had ten drugs under development, and

we believed that whether or not they were approved

were independent events, and they all had the same probability

of being approved, then under all of those conditions

the number of approvals that we're going to have could be modeled as

a binomial random variable.

But we'd need two things going on there,

one, independence of the decisions as to whether the drug was approved or not, and

two, it's the same probability for each of these Bernoulli events.

So whether or not that's the case, one would have to examine carefully.

But if it were, you'd be adding up Bernoullis, and so

you'd be looking at a binomial for the number of successes in those ten drugs.

4:11

Now, in terms of notation we often write the number of trials

as the letter n, so in this instance it's ten.

And the probability of success as p, which in this instance, if it's tossing a fair coin,

is 0.5.

Now I've written down the equation that allows you to calculate the probability

that this binomial random variable comes out to any particular outcome, and

we write that generically as the probability that big X equals little x.

Remember, big X is the random variable, little x is some particular outcome.

The formula that you need starts with n choose x; that

first object inside the big parentheses is known as a binomial coefficient.

And I've defined it underneath.

It's defined as n factorial

over x factorial times n minus x factorial.

What's a factorial?

Four factorial for example,

is four times three times two times one, that's the idea of a factorial.

In practice, you should never calculate these things by hand;

you'd always do them on a calculator, computer, or spreadsheet.

But continuing through the discussion of the formula for the probabilities for

the binomial.

It starts off with n choose x.

Then you have p to the power x, because you've got x successes, and

then you're going to have n minus x failures,

and each one of those has probability 1 minus p.

And you can multiply everything together there because we have

assumed independence.
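For reference, the formula as described in words above can be written out as follows, using the lecture's symbols: n trials, success probability p, random variable X, and a particular outcome x.

```latex
\[
P(X = x) = \binom{n}{x}\, p^{x} (1 - p)^{\,n - x},
\qquad
\binom{n}{x} = \frac{n!}{x!\,(n - x)!},
\qquad x = 0, 1, \dots, n .
\]
```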

5:34

So that's what the formula looks like.

So at least in theory you could calculate any particular probability.

So I could say, what's the probability that you got 7 heads, for

example, in 10 coin tosses.
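As a minimal sketch of that calculation, assuming a fair coin so that p is 0.5, the function below evaluates the binomial formula directly. The name `binomial_pmf` is an illustrative choice; `math.comb` is the standard-library binomial coefficient.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials, each with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Probability of exactly 7 heads in 10 tosses of a fair coin
prob_seven_heads = binomial_pmf(7, 10, 0.5)
print(round(prob_seven_heads, 4))  # 0.1172
```

So seven heads in ten tosses happens a little under 12% of the time.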

Now going to our summaries, remember the reason why we want to create these

summaries is that it's a sort of elegant way of telling people about the outcome

of our experiment, in the sense of when we have a probability distribution,

we like to summarize that probability distribution.

And so the mean of the binomial is n times p.

Remember p was the mean of the Bernoulli, and we did this n times, and

we got n times p.

And the variance is n times p times one minus p.

So that would mean that, I don't have it on the slide but

the standard deviation would be the square root of n times p times one minus p, so

those are the mean and the variance of a binomial.
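One way to convince yourself of these summaries is to compute the mean and variance directly from the probabilities and compare them with the shortcuts. This sketch uses the ten-toss fair coin example; the variable names are illustrative.

```python
from math import comb, sqrt

n, p = 10, 0.5  # ten tosses of a fair coin

# Binomial probabilities for every possible number of heads, 0 through n
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

# Mean and variance computed from the definition of expectation
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(mean, n * p)                # both 5.0
print(variance, n * p * (1 - p))  # both 2.5
print(round(sqrt(variance), 4))   # standard deviation, 1.5811
```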

7:24

In the right hand panel here, I'm looking at a different binomial.

This one where we have n equal to 20 trials.

But the probability of success in each one is 0.75.

And so, the expected value of that binomial would be n times p which is 15.

And you can see how the probabilities break out around that 15.

So then I plotted these graphs simply by calculating

the binomial probabilities from the formula on the previous slide, so you get

a sense of the shape of the probabilities associated with the binomial distribution.
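A sketch of how such a graph could be produced for the n equal to 20, p equal to 0.75 case: compute each probability from the formula and draw a crude text bar for it. The text-bar display is just an assumption for illustration; the lecture's slides use proper plots.

```python
from math import comb

n, p = 20, 0.75  # the right-hand panel's binomial

# Probability of each possible number of successes, 0 through 20
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

# Crude text "plot": roughly one '#' per percentage point of probability
for x, px in enumerate(pmf):
    print(f"{x:2d} {px:.4f} {'#' * round(px * 100)}")

most_likely = max(range(n + 1), key=lambda x: pmf[x])
print("expected value:", n * p)             # 15.0
print("most likely outcome:", most_likely)  # 15
```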

It's going to depend on the number of trials and p, the probability of success.

But this binomial distribution is certainly one of

the building blocks of many probability models.

Now that I've introduced the binomial distribution I wanted to take a moment to

provide you with an example.

And the example that I am using is, in fact, used in practice.

So it's a binomial model for the market.

Right at the beginning of this module I talked about how if you're an energy

intensive company the price of oil would be very important to you.

And so here's a potential model that someone might use to

8:56

We'll say that it goes up by u% with a probability p,

and down by d% with a probability 1 minus p.

Something has to happen, p and 1 minus p add up to 1.

We're also going to assume that day to day moves are independent of

one another which was one of the assumptions

that we need to generate one of these binomial random variables.

So, to put some numbers in, to make it a bit more specific,

let's say the probability of an up day is 55%, 0.55.

That means that the probability of a down day must be 0.45.

And let's say when the market goes up, it goes up by a quarter of a percent.

When it goes down, it goes down by 0.22 of a percent.

And I've just provided these numbers to illustrate the idea of

the binomial market model.

In practice you might estimate these from some historical data.

9:50

Now I'm going to illustrate with a time horizon of three days.

So I'm going to imagine three consecutive days in the market.

And if we have three consecutive days then there are going to be eight

possible outcomes in terms of ups and downs over those three days.

Now you might think at first that there would be six, two outcomes and three days,

two times three equals six.

But that's not how you count potential outcomes.

It's two to the power three, which is eight outcomes, because on the first day

you can be up or down, on the second you can be up or down, and on the third up or down.

And so all those combinations give you eight potential outcomes.

And I've listed them here.

You can be up up up, up up down, up down up,

up down down, down up up, down up down, down down up,

or down down down, a really bad three day sequence at the end there.

So those are all the possible outcomes.
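The counting argument can be checked by listing every up/down sequence for three days. This sketch uses `itertools.product`, which happens to produce the sequences in the same order as they were read out above.

```python
from itertools import product

# Every possible sequence of "up" / "down" over three consecutive days
paths = list(product(["up", "down"], repeat=3))

for path in paths:
    print(" ".join(path))

print(len(paths))  # 8, which is 2 ** 3, not 2 * 3
```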

Now for each of those possible outcomes, there's going to be a change in the market

depending on the number of ups and downs we had.

So I want to take one of these potential outcomes,

let's say the up, up, up outcome, and

work out after those three days how much the market would have changed by.

So on the first day, if we're up, we go up by a quarter of a percent, and

the second day a quarter of a percent, and the third day a quarter of a percent.

Going up by a quarter percent means multiplying by 1.0025.

So we have three up days, we multiply those together and you'd get 1.007519.

And that says that after those three days the market will have gone up by

a little more than three quarters of a percent.

So for any particular combinations of outcomes,

we can work out how much the market has moved simply by

multiplying the appropriate factors together.

And in this illustration, it's the three up days.
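Here is a minimal sketch of that multiplication, using the example's numbers: up a quarter of a percent means a factor of 1.0025, and down 0.22 of a percent means a factor of 0.9978. The function name `overall_factor` is an illustrative choice.

```python
UP_FACTOR = 1.0025    # up by 0.25%
DOWN_FACTOR = 0.9978  # down by 0.22%

def overall_factor(path):
    """Multiply the daily factors along a sequence of 'up'/'down' days."""
    factor = 1.0
    for move in path:
        factor *= UP_FACTOR if move == "up" else DOWN_FACTOR
    return factor

three_ups = overall_factor(("up", "up", "up"))
print(round(three_ups, 6))  # 1.007519
```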

Now as I pointed out, there are eight potential outcomes and here they are.

So, we've talked about tree representations of a set of events,

a sequence of events.

And this is a very natural way to

illustrate the potential moves of this market that we're thinking about.

It would be quite straightforward to implement

within a spreadsheet environment.

So the way that we think about this is that there's a current price.

That's the root of the tree.

And then after the first day, we can either be up or down.

And then the second day up or down, and then third day up or down, and

you can see how the tree allows us to list out the eight potential outcomes.

And if we follow the top most branch of the tree that's the up up up event.

12:29

Now the chances of that happening: remember,

the probability in this model of an up day is 0.55, or 55%, and

we'd be working along the branch at the very top of the tree here.

That is up, up, up.

So given the independence assumption, we multiply probabilities together.

And so the probability that we have three up days in a row is 0.55 times 0.55

times 0.55,

which comes out to 0.166, which is about one sixth.

So that's the chances that we get three up days in a row.

Now, if we have those three up days,

we can also work out how much the market has moved.

Remember, when it goes up according to this model, it goes up by a quarter of

a percent, and there you can see I've written in what the overall move is over

these three days: it's 0.75%,

the three quarters of a percent I talked about on the previous slide.

And I can do that calculation going along any set of branches of this tree. So if

we follow the second one, which is up,

up, down, then the probability of that happening is 0.55 times 0.55.

Those are the two ups.

And then we go down, times 0.45.

And the probability of that particular sequence, up, up, down, is 0.136.

And if we get that sequence of up and down days, then we can multiply the appropriate

factors together to get an overall move of 0.28%,

as I've written in the second cell at the end of the tree.
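The two branch calculations so far can be sketched as below: multiply the daily probabilities for the chance of a path, and multiply the daily factors for its overall move. The helper names are illustrative, and the numbers are the ones from the example.

```python
P_UP, P_DOWN = 0.55, 0.45                # daily probabilities in the example
UP_FACTOR, DOWN_FACTOR = 1.0025, 0.9978  # up 0.25%, down 0.22%

def path_probability(path):
    """Probability of one specific sequence, assuming independent days."""
    prob = 1.0
    for move in path:
        prob *= P_UP if move == "up" else P_DOWN
    return prob

def path_move_percent(path):
    """Overall percentage change after following the sequence."""
    factor = 1.0
    for move in path:
        factor *= UP_FACTOR if move == "up" else DOWN_FACTOR
    return (factor - 1.0) * 100

for path in [("up", "up", "up"), ("up", "up", "down")]:
    print(path, round(path_probability(path), 3), round(path_move_percent(path), 2))
# ('up', 'up', 'up') 0.166 0.75
# ('up', 'up', 'down') 0.136 0.28
```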

13:56

Now you'll notice if you go down through those eight potential outcomes,

a number of them have exactly the same overall probability calculation, and

move calculation.

That's because the chances of going up, up, down are the same as going up,

down, up, which is the same as going down, up, up.

And so that's where this binomial set of probabilities is coming in.

There are actually three ways you can move through the tree

and get two ups and a down.

That's the binomial coefficient that we had seen when I had actually written down

the probability formula for calculating the probability of any particular

number of successes in these trials.

And so what's the probability of getting two ups and a down?

As I said, that's up, up, down; up, down, up; or down, up, up.

And we could find the three cells at the end of the tree here and

add up those probabilities.

It's 3 times 0.136 to get the probability of two ups and a down.

And that factor of three, the fact that there are three cells there that encode

two ups and a down, is the binomial coefficient.

Formally, you're going to choose two ups, two successes, out of the three Bernoulli

trials, because we're looking three days ahead, and so

that would be, in terms of the binomial coefficient, three factorial

over two factorial times one factorial, which is six over two, which is three.

The three possible ways of moving through the tree.
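A short check of that count, done three ways: with the built-in binomial coefficient, with the factorial definition, and by listing the paths through the tree. Labeling the days "U" and "D" is just a shorthand for this sketch.

```python
from itertools import product
from math import comb, factorial

P_UP, P_DOWN = 0.55, 0.45  # daily probabilities in the example

# Three equivalent ways to count the paths with two ups and one down
ways = comb(3, 2)
by_factorials = factorial(3) // (factorial(2) * factorial(1))
two_up_paths = [path for path in product("UD", repeat=3) if path.count("U") == 2]

print(ways, by_factorials, len(two_up_paths))  # 3 3 3

# Probability of two ups and a down, in any order
prob_two_ups = ways * P_UP**2 * P_DOWN
print(round(prob_two_ups, 3))  # 0.408
```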

Likewise the chances of two downs and

an up can be calculated in a similar fashion.

And the worst sequence of events, the down, down, down that is the bottom

cell that you can see, the lowest set of branches on the tree.

And the probability that you get three down days is .45 times .45 times .45 which

comes out to be 0.091, about a tenth, as opposed to a sixth for the three up days.

And those probabilities are different because we have a different probability of

going up than we do of going down.

And if we did have three down days in a row, we can work out

the move that the market makes, and

it will fall by 0.66 of a percent if we have three down days in a row.
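Putting the pieces together, here is a sketch that tabulates the whole three-day model by number of up days: the binomial probability of each count and the overall move it produces. The layout and names are illustrative; the inputs are the example's 0.55, 0.25%, and 0.22%.

```python
from math import comb

N_DAYS = 3
P_UP = 0.55
UP_FACTOR, DOWN_FACTOR = 1.0025, 0.9978  # up 0.25%, down 0.22%

table = []
for ups in range(N_DAYS + 1):
    downs = N_DAYS - ups
    # Binomial probability of exactly `ups` up days out of three
    prob = comb(N_DAYS, ups) * P_UP**ups * (1 - P_UP)**downs
    # Overall move depends only on how many ups and downs, not their order
    move_percent = (UP_FACTOR**ups * DOWN_FACTOR**downs - 1.0) * 100
    table.append((ups, prob, move_percent))
    print(f"{ups} ups, {downs} downs: probability {prob:.3f}, move {move_percent:+.2f}%")

total_probability = sum(prob for _, prob, _ in table)
print(round(total_probability, 6))  # 1.0
```

The first row is the three-down-days case, with probability about 0.091 and a fall of about 0.66%, and the last row is the three-up-days case.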

So here's an illustration of the binomial