Welcome back for part 5 of our notebook here on bagging. Here we're going to select one of our better-performing models and then find each of the classification metrics that we discussed earlier in the course. The first thing we do is take that random forest and set its number of estimators equal to 100. That way you're working with 100 trees, which we saw performs very well for a random forest. We're then going to use that model to come up with the predictions for X_test: either one or zero for churn versus not churn. Then we're also going to come up with the predicted probabilities. This is important because the predicted probabilities are needed in order to calculate our ROC-AUC score, as we saw earlier. Now, we haven't discussed much how probabilities are predicted for random forests, or even for decision trees for that matter, but it's very simple. For a decision tree, imagine we're at the bottom of the tree, at one of the leaves, where we make our decision: churn or not churn. Let's say that leaf held 10 training values, and out of those 10, nine have outcomes of churn and one has an outcome of not churn. We predict churn because it has more values, and we give that prediction a 90 percent probability, because 9 out of those 10 values were churn. That's how we come up with the probability for a decision tree. When we extend that to random forests, all we do is average those leaf probabilities across the decision trees for that row: if the leaf probabilities were 70 percent in one tree, 90 percent in another, and 80 percent in a third, we'd average 70, 80, and 90 and end up with a predicted probability of 80 percent.
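The steps above can be sketched as follows. This is a minimal sketch, not the notebook's actual code: the synthetic data from make_classification and the 80/20 split stand in for the churn DataFrame, and the variable names are assumptions matching the narration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn data (assumption: in the notebook,
# X and y come from the churn DataFrame instead)
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# 100 trees, as described above
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)        # hard labels: 1 = churn, 0 = not churn
y_prob = model.predict_proba(X_test)  # shape (n_samples, 2): P(class 0), P(class 1)
# Each row of y_prob is the average of the leaf-level class fractions
# across all 100 trees, as described in the narration.
```

Note that predict_proba returns one column per class, which is why we'll later slice out only the positive-class column.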
So you run this, and then just to take a look at y_prob, we see that it outputs the probability of each of the classes: class 0 in the first column and class 1 in the second column. When we actually use our ROC-AUC, we only want the probability for that positive class, and we'll see that come into play in just a second. So we import from sklearn.metrics our classification_report, accuracy_score, precision_score, recall_score, f1_score, and roc_auc_score. The first thing that we're going to do is print that classification_report, and as we saw in earlier notebooks, the classification_report gives us each of these scores for both the negative and positive class. So it'll give us the accuracy, precision, pretty much all the scores we have here except for the roc_auc_score. Then, just to ensure that we got it all correct, we're also going to pull out each of these values individually: accuracy, precision, recall, and f1 for the positive class, as well as the roc_auc_score, for which we'll have to use the predicted probabilities. Here, as we discussed just a second ago, we take all of the rows but only that second column. When I run this, we see that for our negative class, which has a lot more values (note the higher support), we get fairly high scores. For the positive class, meaning that they did churn, we didn't do quite as well; that's a smaller fraction of the data. We did okay, in that we predicted the actual values better than just a coin flip, but not great. Then we see that each of these individual values matches up with the scores in the report for our positive class. We're also able to get the AUC score, which is fairly high as well, and for binary classification that score is the same whether we frame it around the positive or the negative class. Now for part 6, we're going to visualize our confusion matrix, plot the ROC and precision-recall curves, and then plot our feature importances.
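The metrics described above might look like this. Again a hedged sketch on synthetic stand-in data, not the notebook's actual cells; the specific scores will differ from the churn results narrated above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (classification_report, accuracy_score,
                             precision_score, recall_score, f1_score,
                             roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn data (assumption)
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)

# Per-class precision, recall, f1, and support, plus overall accuracy
print(classification_report(y_test, y_pred))

# Pull out each score individually for the positive class
acc = accuracy_score(y_test, y_pred)
prec = precision_score(y_test, y_pred)
rec = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
# ROC-AUC needs probabilities, and only for the positive class:
# all rows, second column
auc = roc_auc_score(y_test, y_prob[:, 1])
```

The slice `y_prob[:, 1]` is the "all of the rows, but only that second column" step from the narration.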
So we'll see how those feature importances come out as we go through. Hopefully, they'll be in line with what we saw before in regards to the correlations between our churn value and each of those different columns. So we're going to create our confusion_matrix using our y_test and our predicted values. We're then going to initiate our bounding box and call a heatmap on that confusion_matrix that we just came up with. We want to annotate it so we can see the actual numbers, and we use all the defaults that we saw before to get the confusion_matrix visualization that we'd like, with the labels for not churn and churn as the xticklabels, yticklabels, and so on. We can see here the confusion_matrix between the ground truth and the predictions. For those that did not churn, we predicted fairly well: 1,081 of those that were actually not churn were predicted correctly, and only 21 were predicted incorrectly. But of those predicted as not churn, 66 of them actually did churn, so our predictions of not churn weren't as precise. Then for the ground truth of churn, we see that 66 out of our total were incorrect, and that's why we had that low recall score that we saw earlier. Now let's print out our ROC and precision-recall curves. The way this is going to work is that we'll have two different subplots, so we'll have an axList: two bounding boxes in which we put our plots. Starting with ax equal to axList[0], we're going to get our ROC curve, which outputs the false positive rate and the true positive rate, the two values we want to plot against one another, as well as the different thresholds. The ROC curve shows, across multiple different thresholds, how well the model was able to separate the two classes. We pass in that y_test, and then we pass in our probability values, which we calculated earlier, again just for that positive class.
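Before the curves, the confusion-matrix heatmap described above can be sketched like this. It's a hedged sketch on synthetic stand-in data, so the cell counts won't match the 1,081 / 21 / 66 figures from the narration; the seaborn options mirror the ones described.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripting
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn data (assumption)
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are ground truth, columns are predictions
cm = confusion_matrix(y_test, y_pred)

# Bounding box plus annotated heatmap, labeled not churn / churn
fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
            xticklabels=["not churn", "churn"],
            yticklabels=["not churn", "churn"], ax=ax)
ax.set_xlabel("Predicted")
ax.set_ylabel("Ground truth")
```

With this row/column convention, the off-diagonal cell in the churn row is the missed churners that drive down recall.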
We're then going to plot the false positive rate on our x-axis and the true positive rate on our y-axis, using one of the colors from our color scheme and setting our linewidth equal to five. We're then going to plot a dashed line to create that diagonal from (0, 0) to (1, 1), with a linewidth equal to 0.3, and make sure the axes run between zero and one. We then set our xlabel, ylabel, xlim, ylim, and the full title, and we put a grid on top of that object. Then for the precision_recall_curve, we move to axList[1]. We call precision_recall_curve on our y_test and our y_prob, for just that second column. We get our outputs of precision and recall, as well as the thresholds, which we don't need here. We're going to plot recall on the x-axis and precision on the y-axis, and we should see that trade-off. Again, using many of the same defaults that we saw before, we set our xlabel and ylabel, plot a grid, and so on, on our second bounding box here. So let's run this. We see our ROC curve; the diagonal is just as good as a 50-50 split, and we have a fairly strong curve above it. Then on the other side we see that precision-recall trade-off, again a pretty strong curve. We're doing a fairly good job in regards to our predictions here. Now to get our feature importances, all we have to do is access model.feature_importances_. That gives us a value for each feature, where a larger value means it's more important and a smaller value means it's less important. We're going to create a pandas Series with the index equal to each of our different feature columns and the values equal to the respective feature importances, and then we're going to sort those values. We'll then call that Series' plot method.
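The two-panel ROC and precision-recall figure described above might look like this sketch. Synthetic stand-in data again, and the color and figure-size choices are assumptions; the linewidths match the narration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripting
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve, precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn data (assumption)
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
y_prob = model.predict_proba(X_test)

# Two bounding boxes, one per curve
fig, axList = plt.subplots(1, 2, figsize=(12, 5))

# ROC curve: false positive rate vs true positive rate across thresholds
ax = axList[0]
fpr, tpr, thresholds = roc_curve(y_test, y_prob[:, 1])
ax.plot(fpr, tpr, linewidth=5)
ax.plot([0, 1], [0, 1], ls="--", color="black", linewidth=0.3)  # chance diagonal
ax.set(xlabel="False positive rate", ylabel="True positive rate",
       xlim=[-0.01, 1.01], ylim=[-0.01, 1.01], title="ROC curve")
ax.grid(True)

# Precision-recall curve: the trade-off between the two
ax = axList[1]
precision, recall, _ = precision_recall_curve(y_test, y_prob[:, 1])
ax.plot(recall, precision, linewidth=5)
ax.set(xlabel="Recall", ylabel="Precision", title="Precision-recall curve")
ax.grid(True)
```

Both curves take the same inputs: the true labels and the positive-class probabilities; the thresholds returned by precision_recall_curve are unpacked to `_` since they aren't plotted.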
We're going to create a bar plot, set our ylabel, and when we run this, we see that satisfaction is by far the most important feature, which makes sense: if someone's more satisfied, they're less likely to churn. It also matches the correlation that we saw earlier, where satisfaction was the column that correlated most heavily with our churn value. Now that closes out our notebook here on bagging. From here, we will go back to our lecture and talk about another ensemble method, which is boosting. All right. I'll see you there.
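Finally, the feature-importance plot described above can be sketched as follows. The feature names here are made up for illustration; in the notebook the index would be the churn DataFrame's columns (satisfaction and so on), and the data is a synthetic stand-in.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripting
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in; "feature_0" .. "feature_9" are hypothetical names
feature_cols = [f"feature_{i}" for i in range(10)]
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

# Larger value = more important; sort so the bar plot is ordered
feature_imp = pd.Series(model.feature_importances_,
                        index=feature_cols).sort_values()
ax = feature_imp.plot(kind="bar")
ax.set_ylabel("Relative feature importance")
```

For a random forest, feature_importances_ is normalized to sum to 1, so each bar can be read as that feature's share of the total importance.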