One of the things we can do is modify some of the aesthetics
so that we can highlight different subgroups of the data.
There are lots of different cars here.
Some of them are front-wheel drive, some of them are rear-wheel drive, and
some of them are four-wheel drive.
So we can separate those observations out by looking at drv, the drive variable.
And so I've specified the x and y coordinates just like before.
I specified the data frame just like before.
And then another argument I have is the color argument.
And I'm gonna say that the color is mapped to this drive variable, drv.
And all that says is that the different levels of the drive variable
will each be assigned a different color.
And notice I don't specify what those colors are,
they're specified automatically.
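As a rough sketch of the kind of call being described here: this assumes the data frame is the `mpg` data set that ships with ggplot2, and that the x and y coordinates from before are `displ` and `hwy`. Those names aren't stated in this part of the talk, so treat them as assumptions.

```r
library(ggplot2)

# Assumptions: `mpg` is ggplot2's built-in cars data frame, and `displ`
# (engine displacement) and `hwy` (highway mileage) are the x and y
# variables specified "just like before".
p <- qplot(displ, hwy, data = mpg, color = drv)

# Printing the plot draws the points colored by drive type, with a legend.
print(p)
```

Recent ggplot2 releases mark `qplot()` as deprecated, so you may see a warning. The same plot can be written as `ggplot(mpg, aes(displ, hwy, color = drv)) + geom_point()`.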
So you can see on this new plot here that the front-wheel drive cars are in green,
the rear-wheel drive cars are in blue, and the four-wheel drive cars are in red.
And so you can see that the front-wheel drive cars tend to have
the highest mileage, the four-wheel drive cars tend to have the lowest mileage, and
the rear-wheel drive cars are somewhere in the middle.
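One way to check that reading of the plot is to compare the average mileage per drive type. This is only a sketch, and it assumes that "mileage" means the `hwy` column of ggplot2's `mpg` data frame.

```r
library(ggplot2)  # only needed here for the built-in `mpg` data frame

# Assumption: "mileage" refers to the `hwy` (highway miles per gallon) column.
# tapply() computes one mean per level of drv.
mean_hwy <- tapply(mpg$hwy, mpg$drv, mean)

# One value per drive type: "4" (four-wheel), "f" (front), "r" (rear).
print(round(mean_hwy, 1))
```

If the plot reads the way I described it, the front-wheel drive mean should come out highest and the four-wheel drive mean lowest.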
And notice that the legend, which color-codes the different levels of
the factor variable, was placed on the plot automatically.
I didn't have to do anything special.
And so it's very nicely organized and thought out, and
everything is done automatically.