One of the cool things about Pandas is the describe method. We can use it to get the distribution of values within a DataFrame. The downside is that we need to fetch all of the data from the database in order to calculate that information on the client, and that can be really slow. Here, I present to you a way to do it entirely with an aggregation.

First, we have the remove_id variable, which is just a $project stage that removes the _id field. Next, we have the findkeys variable, a $group stage that converts every document it comes across into the list of its keys and then groups on that collection of keys. Then we have merge_keys, a $project stage that does just that: it uses $reduce to go over the output of that group and flatten the lists, so we end up with a single unique list of every key encountered in our documents.

stats_by_key is where most of the magic happens. We have a stats helper function that really just cleans up the data for us. Down here is one of the facets we're going to use: a $bucketAuto stage followed by a $group stage. This gives us our percentile information as well as the minimum and maximum. Then, in another facet, we use a $group stage to get our count, our standard deviation, and the mean. Here is where it's all combined together: you can see where we return the count, mean, standard deviation, minimum, 25th percentile, 50th, 75th, and max. (I've sketched what these pieces might look like at the end of this section.)

Let's see how all this looks when we run it. Here you can see the aggregation pipeline that those two functions above give us. Now, I'm not going to scroll down through it, but I can promise you it is very long. Here's the information it gives us: it looks a lot like the Pandas describe method. And here is the Pandas describe output, so that you can compare the two. This is a pretty cool feature.
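
Below is a minimal sketch of the approach, reconstructed from the narration rather than copied from the screen. The database and collection names, the function names (keys_in_collection, stats_by_key), and the exact expressions inside each stage are my own assumptions; only the stage choices ($project, $group, $reduce, $bucketAuto, $facet) come from the walkthrough above.

```python
from pymongo import MongoClient

client = MongoClient()          # assumes a local mongod
coll = client["test"]["data"]   # hypothetical database/collection names

# A $project stage that strips the _id field.
remove_id = {"$project": {"_id": 0}}

# A $group stage that turns each document into its list of keys
# ($objectToArray + $map) and collects the distinct key lists.
find_keys = {"$group": {
    "_id": None,
    "key_lists": {"$addToSet": {"$map": {
        "input": {"$objectToArray": "$$ROOT"},
        "as": "kv",
        "in": "$$kv.k",
    }}},
}}

# A $project stage that uses $reduce to flatten those lists into one
# unique list of every key seen in the collection.
merge_keys = {"$project": {"keys": {"$reduce": {
    "input": "$key_lists",
    "initialValue": [],
    "in": {"$setUnion": ["$$value", "$$this"]},
}}}}


def keys_in_collection(coll):
    """Run the key-discovery pipeline and return the unique key list."""
    doc = next(coll.aggregate([remove_id, find_keys, merge_keys]), None)
    return doc["keys"] if doc else []


def stats_by_key(coll, key):
    """describe()-style stats for one numeric field, computed server-side.

    One $facet branch feeds $bucketAuto with four buckets, whose
    boundaries approximate min/25%/50%/75%/max; the other is a plain
    $group for count, mean, and sample standard deviation.
    """
    pipeline = [
        {"$match": {key: {"$type": "number"}}},
        {"$facet": {
            "quartiles": [
                {"$bucketAuto": {"groupBy": f"${key}", "buckets": 4}},
            ],
            "moments": [
                {"$group": {
                    "_id": None,
                    "count": {"$sum": 1},
                    "mean": {"$avg": f"${key}"},
                    "std": {"$stdDevSamp": f"${key}"},
                }},
            ],
        }},
    ]
    facets = next(coll.aggregate(pipeline))
    buckets = facets["quartiles"]
    moments = facets["moments"][0]
    # Each $bucketAuto bucket looks like {"_id": {"min": ..., "max": ...}}.
    return {
        "count": moments["count"],
        "mean": moments["mean"],
        "std": moments["std"],
        "min": buckets[0]["_id"]["min"],
        "25%": buckets[1]["_id"]["min"] if len(buckets) > 1 else None,
        "50%": buckets[2]["_id"]["min"] if len(buckets) > 2 else None,
        "75%": buckets[3]["_id"]["min"] if len(buckets) > 3 else None,
        "max": buckets[-1]["_id"]["max"],
    }
```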
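
And to reproduce the side-by-side comparison from the end of the walkthrough, something like this (again assuming the hypothetical names above):

```python
import pandas as pd

# Server-side: one small aggregation per discovered key.
server = {k: stats_by_key(coll, k) for k in keys_in_collection(coll)}
print(pd.DataFrame(server))

# Client-side: fetch every document over the wire, then describe().
df = pd.DataFrame(coll.find({}, {"_id": 0}))
print(df.describe())
```

One caveat on the quartiles: $bucketAuto reports bucket boundaries rather than interpolated percentiles, so those rows will be close to, but not identical to, what Pandas computes. On MongoDB 7.0+ you could swap in the native $percentile accumulator for exact agreement.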