In this lesson, we are going to look at statistical quality control. We're going to look at the different types of variability that can arise when one looks at measurements, at statistical control charts (control charts for short), and at process capability analysis.
To start out, let's think about a marksman. Now a marksman is someone with a gun who's capable of firing accurately at a target. So, let's start out by looking at this marksman's shot pattern after they've shot at the target, and let's assume they have shot six bullets. When we look at this marksman's shot pattern, what we notice is that there are six holes in the target, they're all within the bulls-eye, and it's a fairly tight grouping of shots. If we were to take, for example, the distance from the vertical line to each shot and use that to come up with a probability distribution, we would get something that looks like the distribution shown below the target. In this particular distribution, what we notice is that shots are more likely to be close to the center and less and less likely as one goes away from the center.
Now, here's a different target that someone's been shooting at. When we look at this target, what we notice is that there are more shots and the shots are scattered a little bit more; several of them are in the bulls-eye but quite a few aren't. If you plotted the distance from this vertical line to each of the shots and came up with a probability distribution, it might look something like what's shown under the target itself. Notice the difference between the two distributions. They both have a high probability of being in the center, but the second one is spread out a lot more because the shots are scattered a lot more.
Now the question one has to ask is: is the second shot pattern that we are looking at from a different marksman, a different person shooting at the target?
Or is it possible that we could get this second pattern from the same marksman who shot the first one, but for some reason it's an off day for the marksman? Perhaps the marksman has been out the previous night drinking on the town and has a severe hangover while he's doing this. The reason we ask is this: if it's a different person who's created this pattern, then we cannot say very much about why there is this scatter of shots; it could just be that that person isn't that capable and therefore this is normal for them. On the other hand, if it's the same marksman and we see this scatter of shots, we have to ask ourselves a question: was there a special reason why this happened? So, what we are looking at then is evidence of a particular thing occurring, and we are trying to figure out whether this thing could have happened in the normal course of things or whether there's a special, assignable reason why this particular set of measurements was obtained. So this is the problem that we are looking at.
Now why might the shot pattern for that marksman be different? There are several possibilities. One is material. For example, the bullets being used may not have been correctly packed, or there might be deformities or grooves in a bullet that cause it to stray from its intended course. It could be that there is variability in people and equipment. So for example, our marksman, we said, might have had a hangover on that particular day, and that's why a problem arose. It could be that the equipment the person was using has some problems and is not able to perform accurately every time. Variability can occur in processes, and variability can occur in the methods that are used to do a particular thing. People get fatigued over time, and because of that, the way that they apply the method that they are supposed to apply starts changing and they start making mistakes, thereby creating errors.
The environment itself might be a problem. So for example, the first pattern might have been shot on a windless day, and for the second pattern it might have been a very windy day, causing the bullets to drift as the marksman shoots. Lastly, it's possible that whatever observations we are looking at may have been incorrectly measured, and so even if everything else was fine, the process of measurement itself introduces a source of variability in our measurements. So, there are several ways that variability creeps into measurements and processes, and all of these different sources are then revealed in the form of scatter in the measurements that we obtain.
Walter Shewhart, when he thought about this, decided that he could categorize variability into two kinds. The first one he called common cause. This is variability where the process itself has a certain limitation, and because of the limitations of the process, you don't always get exactly the same value from the measurement of whatever's being produced by this process. This he called common cause or controlled variation. This is what one expects on a day-in, day-out basis from this process or from this person. The second one is what he called special cause. These are the things that are unexpected. They arise because something has changed, which causes things now to be different. So for example, our marksman had a hangover; that caused the measurements to be very different. So now we have a special reason to which we can attribute the changes that we're observing. Usually, once special causes are identified, one can look for corrective action to try and make sure that they don't happen again. Common causes, on the other hand, are part of the process, and until the process itself is made more capable, this is what's going to happen.
So if we go back to our marksman example, if you give a novice a gun and ask them to shoot, they're going to get a wide scatter of points on the target. In fact, sometimes they are lucky to even hit the target. Now if you don't do anything, the novice will continue to perform poorly and you're going to get a wide scatter. However, if we now train the novice so that they become an expert, we will be able to control the scatter of shots that we see on the target. So, common cause is where we are looking at the inherent capability of the person or process, and special cause is where we identify unexpected changes that may have occurred that could lead to changes in the results that we are observing. So, these are the common cause and the special cause.
Now, how do we decide whether what we are observing is common cause variation or special cause variation? The Shewhart control charts, or the statistical control charts, are a way to discriminate between these two types of variability. So, what we try to figure out, then, is the inherent capability of this process. What is this process really capable of? And then once we know what the process is capable of, how do we look for anything that shows deviation from that inherent capability, so that we can then look for assignable special causes for this particular deviation?
The Shewhart control charts make an underlying assumption about the way data is distributed. The underlying assumption is that data is normally distributed. The normal probability distribution is denoted by N, for normal, and it has two parameters: the mean, given by the Greek letter Mu, and the variance, given by the Greek letter Sigma to the power two, or Sigma squared.
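Written out in symbols, the notation just described looks roughly like the following. The symbols X (for a measurement) and f (for the density curve) are labels introduced here for illustration; the lesson itself only names N, Mu, and Sigma squared.

```latex
X \sim N(\mu, \sigma^2),
\qquad
f(x) = \frac{1}{\sigma\sqrt{2\pi}}
       \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```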
The standard deviation is then just Sigma, or the square root of the variance. If you look at the normal distribution, the mean is the center of this distribution, and if you move away two standard deviations to the left and two standard deviations to the right, then about 95% of the total probability lies within those two limits. So, the mean plus or minus two Sigma has about 95% of the entire probability, which is 100% obviously. If we go from the mean three Sigma to the left and three Sigma to the right, so we have the mean plus or minus three Sigma, then that encompasses 99.7% of the probability.
So, if I'm looking at observations from whatever quality experiment I'm doing, and I find values that are outside the mean plus or minus three Sigma limits, then that tells me that the chance of getting a value outside those limits is very, very small. In fact it is about 0.3%. So, now if the chance is so small, I could look at this value that I've obtained that's outside the limits, and be a little suspicious of it and say, “Maybe this value that I have obtained that is outside the mean plus or minus three Sigma limits is not something that is very likely, and so maybe there is some special reason why I'm getting this value,” and that's how I look for outliers. That's how I try to figure out why it is that I am getting a value that is outside those limits.
So, a generic control chart works as follows. We now notice that we are taking our normal distribution and turning it 90 degrees to make a point. We still have the same plus or minus three Sigma limits that we are interested in, so that we have 99.7% of the probability within those limits. So, what a control chart essentially does is plot observations in some sort of order, in chronological order as they are taken in time. This chart has three horizontal lines.
There is the center line, which is called the process average and is the mean of the normal distribution that we are looking at, and then we have two control limits. We call them the lower control limit and the upper control limit. The lower control limit is the process average, or mean, minus three times the standard deviation, and the upper control limit is the mean, or process average, plus three standard deviations. Now, when we plot these different observations, most of them should lie within the upper and lower control limits. Occasionally, and hopefully only occasionally, we will find a point that is outside the limits. Now remember, the chance of a point falling outside those limits is very small, about 0.3%. So, the fact that we found such a point makes us think that maybe we should investigate this point a little bit more and figure out why we might have gotten a value that is outside of those limits. That's what we call an outlier, and that outlier is a signal that we should go and look for an assignable cause for a value that is outside that limit.
Whenever we create such control charts, there are a few things that we need to understand. First, we need to understand what it is that we are measuring, because the type of control chart that we will use, and the assumption underlying that control chart, will change based on what we are measuring. Now, underlying our control charts, we said, was a normal distribution, so how do we ensure that using a normal distribution is the appropriate thing to do for whatever it is that we are measuring? Then we have to figure out how to set the limits, the upper control limit and lower control limit, and then finally we have to recognize special cause or out-of-control situations. Now, if you want to think of what we are measuring, let's go back to the marksman’s example.
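Before returning to the marksman, here is a minimal sketch in Python of the control-limit logic just described: a center line at the process average, limits at plus or minus three standard deviations, and a check for points that fall outside. The function names and the data are made up for illustration. The sketch also assumes the mean and standard deviation are estimated from a baseline set of observations taken while the process was behaving normally; practical control charts often estimate Sigma in other ways (for example from subgroup ranges), so treat this as a simplification rather than the standard recipe.

```python
import math
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lcl, center, ucl) as the mean minus/plus k standard deviations.

    Assumption: `baseline` holds observations from a period when only
    common cause variation was present.
    """
    center = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation of the baseline
    return center - k * sigma, center, center + k * sigma


def points_outside(observations, lcl, ucl):
    """Return (index, value) pairs that fall outside the control limits."""
    return [(i, x) for i, x in enumerate(observations) if x < lcl or x > ucl]


def prob_outside(k):
    """Chance that a normal value lands more than k Sigma from the mean."""
    return math.erfc(k / math.sqrt(2.0))


# Made-up illustrative data, not taken from the lesson.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
new_points = [10.0, 10.2, 9.7, 10.6, 9.9]

lcl, center, ucl = control_limits(baseline)
flagged = points_outside(new_points, lcl, ucl)

print(f"LCL={lcl:.3f}  center={center:.3f}  UCL={ucl:.3f}")
print("Points to investigate:", flagged)
print(f"Chance of falling outside 3 Sigma: {prob_outside(3):.4%}")
```

With this data, only the fourth new point (10.6) lands above the upper control limit, so it is the one to investigate for an assignable cause. The tail probability for three Sigma comes out at roughly 0.27%, which the lesson rounds to 0.3%.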
Suppose our marksman fires six shots at the target at a time. We may be interested, for example, in the distance from the center of the target to each particular shot, in which case we are measuring that actual distance and so we have a measurement variable. But then we may also be interested in looking at the percent of shots that are within the bulls-eye, or the black part, of the target. In this particular case, we find that five out of six are within the bulls-eye, and so we might be interested in this percent. So, from the same experiment, there are a variety of different things that we could be measuring. We could also count how many of these shots are within the center part, the part which has the number 10 written on it. So, what we measure will depend on our circumstances and our interests, but notice that there are two different kinds: one where we are measuring a distance, where there is a measurement of a specific physical attribute, and a second one which categorizes whether the shot was inside some region or outside it and looks at the fraction or percent of shots that satisfy whatever characteristic we have decided to use.
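To make the two kinds of measurement concrete, here is a small sketch in Python. The shot coordinates, the bulls-eye radius, and the function names are all hypothetical, chosen only so that five of the six shots land inside the bulls-eye as in the example above.

```python
import math

# Hypothetical (x, y) positions of six shots relative to the target center,
# in arbitrary units. These values are made up for illustration.
shots = [(0.3, 0.4), (0.0, 0.6), (-0.3, -0.4), (0.5, 0.0), (0.0, -0.2), (1.2, 1.6)]

BULLSEYE_RADIUS = 1.0  # assumed radius of the bulls-eye, same units as above


def distances_from_center(shots):
    """Measurement variable: the actual distance of each shot from the center."""
    return [math.hypot(x, y) for x, y in shots]


def fraction_within(shots, radius):
    """Categorized variable: the fraction of shots that fall inside `radius`."""
    inside = sum(1 for d in distances_from_center(shots) if d <= radius)
    return inside / len(shots)


print("Distances:", [round(d, 2) for d in distances_from_center(shots)])
print(f"Fraction in bulls-eye: {fraction_within(shots, BULLSEYE_RADIUS):.1%}")
```

The first function yields one number per shot, a measured quantity. The second reduces the same six shots to a single fraction, which is the inside-or-outside kind of data.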