0:00

In this video, we will define what we mean by independent events,

learn ways of assessing independence, and

introduce the multiplication rule for independent events.

0:10

Two processes are said to be independent if knowing the outcome

of one provides no useful information about the outcome of the other.

For example, knowing that the coin landed on a head

on the first toss does not provide any useful information for

determining what the coin will land on in the second toss.

The probability of a head on the second toss is 0.5, and so is the probability

of a tail, regardless of the outcome of the first toss.

Therefore, outcomes of two coin tosses are said to be independent.

0:41

On the other hand, knowing that the first card drawn from a deck

is an ace does provide

useful information for calculating the probabilities of

outcomes in the second draw.

This is for drawing the cards without replacement, in other words

not putting the cards back into the deck after we draw them.

For example, the probability of drawing yet another ace is going to be 3 over 51.

We have 51 cards left in the deck, and only three of them are aces.

While the probability of drawing a jack is going to

be 4 over 51, since we still have all four

jacks left in the deck.

Therefore, outcomes of two draws from a

deck of cards without replacement are dependent.
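
The card example can be checked with a few lines of arithmetic. This is only a minimal sketch, assuming a standard 52-card deck with 4 aces and 4 jacks; the helper name `draw_probability` is made up for illustration.

```python
from fractions import Fraction

def draw_probability(target_left: int, cards_left: int) -> Fraction:
    """Probability that the next card drawn is one of the target cards."""
    return Fraction(target_left, cards_left)

# First draw from a full 52-card deck that contains 4 aces.
p_ace_first = draw_probability(4, 52)

# Second draw, given the first card was an ace and was not put back:
# 3 aces and 4 jacks remain among 51 cards.
p_ace_given_ace = draw_probability(3, 51)
p_jack_given_ace = draw_probability(4, 51)

print(p_ace_first, p_ace_given_ace, p_jack_given_ace)
```

The probability of an ace changes between the two draws, which is what dependence looks like in numbers.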

1:24

Based on this definition, we can develop a

general rule for checking for independence between random processes.

If the probability of an event A occurring, given that event B occurred

is the same as the probability of event A occurring in the first place, then events

A and B are said to be independent.

This rule basically says that knowing B tells us nothing about A.
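
Written in symbols, the rule just stated might look like this, with the vertical bar read as "given":

```latex
P(A \mid B) = P(A) \quad \Longrightarrow \quad A \text{ and } B \text{ are independent}
```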

Note that we use this vertical line notation to mean given:

P(A | B) means the probability of A given B. So let's put that rule to use real quick.

2:00

In 2013, SurveyUSA interviewed a random sample of 500 North Carolina residents,

asking them whether they think widespread gun ownership protects

law-abiding citizens from crime or makes society more dangerous.

58% of all respondents said it protects citizens.

67% of white respondents, 28% of black respondents, and 64% of

Hispanic respondents shared this view. Based on these we want to fill in the

blank in the following sentence.

Opinion on gun ownership and race/ethnicity

are most likely, which of the following?

Complementary, mutually exclusive, independent, dependent, or disjoint.

These should all be terms that you're familiar with by now.

Let's take a look at what we're given.

We're given that the probability that a

randomly chosen resident believes that guns protect citizens

is 0.58.

We also know that if the resident is white, then this probability is 0.67.

Once again, we use this vertical line notation to say the probability

that somebody believes that guns protect citizens, given that they're white.

And that probability is 0.67.

If they're black, the probability is 0.28, and lastly, if the

resident is Hispanic, the probability that they

believe that guns protect citizens is 0.64.

Since the probabilities of thinking that guns

protect citizens vary greatly based on the

person's race or ethnicity, opinion on gun

ownership and race/ethnicity are most likely dependent.

So knowing somebody's ethnicity might actually give us useful

information about their opinion on guns, and therefore, we are saying

that the two variables are most likely dependent on each other.
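
The comparison we just made by eye can be sketched in code. The probabilities are the ones quoted from the survey; the function name `suggests_dependence` and the 0.05 tolerance are assumptions chosen for illustration, and the cutoff is not a statistical test.

```python
# Figures quoted in the survey example.
p_protects = 0.58  # P(protects citizens), all respondents
p_protects_given = {
    "white": 0.67,
    "black": 0.28,
    "Hispanic": 0.64,
}

def suggests_dependence(marginal, conditionals, tolerance=0.05):
    """Return True if any conditional probability differs from the
    marginal by more than `tolerance` (an arbitrary illustrative
    cutoff, not a hypothesis test)."""
    return any(abs(p - marginal) > tolerance for p in conditionals.values())

# Show how far each conditional probability sits from the overall one.
for group, p in p_protects_given.items():
    print(f"P(protects | {group}) - P(protects) = {p - p_protects:+.2f}")

print(suggests_dependence(p_protects, p_protects_given))
```

If the variables were independent, every conditional probability would equal 0.58, up to sampling variation.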

3:56

We've been using wording like most likely

dependent since we're working with sample data.

And we're not yet using statistical inference tools

that allow us to take the results that

we get from our sample and expand that to the population at large.

4:13

If we observe a difference between the conditional probabilities that we

calculate based on the sample, we say that these data suggest dependence.

The next natural step would then be to actually conduct a hypothesis test

to see if the differences that we observed

could have just happened due to chance, or natural random sampling variation.

4:33

Or, if there's actually a real difference in the population.

We've done a little bit of that at

the end of the last unit, and we're going to get

back to doing that in the next unit as well.

But for now we're kind of picking up building blocks to get us there.

However, before we get there, we can

actually do a little bit of speculating based

on the magnitude of the differences that we observe as well as the sample size.

For example, consider the observed differences between the conditional

probabilities, like the probabilities we were

just looking at:

the probability that guns protect citizens given that somebody is

white, versus given that they're black, versus given

that they're Hispanic. If these conditional probabilities vary

greatly, in other words, if the differences are large,

then there is stronger evidence that the difference is real,

that is, that we would see something similar had

we had data from the entire population as well.

5:39

Now that we know how to check for independence, let's see what

we can do with events once we find out that they're independent.

The product rule for independent events says that if A

and B are independent, then the probability of A and

B happening is simply the product of their probabilities.

5:58

Say you toss a coin twice.

What is the probability of getting two tails in a row?

Sounds pretty simple, eh?

The probability of two tails in a row is simply going to be the probability

of a tail on the first toss times the probability of a tail on the second toss.

We've talked before about how coin tosses are independent

of each other.

Therefore, we are able to apply this rule that we've just learned.

6:25

The probability of a tail on each toss is simply 0.5, or 1 over 2.

So the overall probability is going to be a quarter, or 25%.
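
One way to convince yourself of this result is to list every outcome of two tosses and compare the direct count with the product rule. This is a small sketch assuming a fair coin; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coin tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Direct count: the share of outcomes in which both tosses are tails.
p_two_tails_counted = Fraction(outcomes.count(("T", "T")), len(outcomes))

# Product rule: P(tail on first toss) * P(tail on second toss).
p_tail = Fraction(1, 2)
p_two_tails_product = p_tail * p_tail

print(p_two_tails_counted, p_two_tails_product)
```

Both approaches give the same answer, as the product rule says they should for independent tosses.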

A quick note, this rule isn't really limited to just two events.

And it can actually be expanded to as many independent events as you need.

So if,

instead of doing two coin tosses, we had a hundred of them.

We could simply multiply a hundred of the same probabilities together.

Generally speaking, if A1, A2, all the way through

Ak are independent, then the probability of all of these

events happening at once is simply going to be

the product of the individual probabilities of the events.
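
In symbols, the general rule might be written as follows, with the hundred-toss case as an instance (assuming a fair coin and asking for all tails):

```latex
P(A_1 \text{ and } A_2 \text{ and } \cdots \text{ and } A_k)
  = P(A_1) \times P(A_2) \times \cdots \times P(A_k)

P(\text{100 tails in a row}) = \left(\tfrac{1}{2}\right)^{100}
```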

7:25

Assuming that the obesity rates stay constant, what is the

probability that two randomly selected West Virginians are both obese?

We're given that 33.5% of West Virginians are obese, which

we can write as the probability of being obese equal to 0.335.

It's often useful to make a list of the givens in the

problem, as we have been doing in the past couple of examples.

This helps to keep everything neat and organized, and it makes it

easier for you to refer back to these values when you need them later in

your calculations.

8:00

We're told that the two individuals are randomly selected,

which means that they're going to be independent of

each other with respect to their obesity status.

In contrast, if we pick two people from the same household

and one is obese, the other one might be more likely

to be obese as well, given that people who live in

the same household are more likely to have shared eating and

exercising habits.

However, since we're randomly selecting these

individuals, we can say that they're independent.

And since the two are independent, the probability of both of

them being obese will simply be

the probability of the first one being obese times the probability

of the second one being obese, each of which is 0.335,

resulting in about an 11% chance of two randomly selected West Virginians both being obese.
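
The arithmetic here is a single multiplication, sketched below. The helper name `prob_all` is an assumption for illustration, and it is only valid when the events are independent.

```python
import math

def prob_all(probabilities):
    """Product rule: probability that all of the given events occur,
    assuming the events are independent."""
    return math.prod(probabilities)

p_obese = 0.335  # given in the example
p_both_obese = prob_all([p_obese, p_obese])

# 0.335 * 0.335 = 0.112225, which rounds to about 11%.
print(f"{p_both_obese:.3f}")
```

The same helper works for any number of independent events, since it simply multiplies whatever probabilities it is given.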

This value, 11%, the probability of both of these people being

obese, is less than the probability of either one of them being obese.

Which makes sense,

for two reasons.

Mathematically speaking, we're multiplying two values between zero and one.

So the product

will necessarily be a value lower than either one of them.

And conceptually we want to find two people

that fit a certain criterion, at the same time.

Therefore, the likelihood of us getting what we want should be lower than

the likelihood of finding just one person who fits that criterion.

Reasoning through the final numerical answer this way is often useful.

It helps us really understand why the formulas that we're using

work the way they do without getting into theoretical proofs.

And it's also useful for checking the final numerical answer

in the context of the data that you're working with.

In other words, it's really a good way to check your work.