Some Important and Frequently Encountered PDFs

This module covers additional concepts and skills for building high-quality models. As you know, a “model” is a description of a process that is applied to available data (the inputs) and produces, as output, an estimate of a future, as-yet-unknown outcome. Very often, a model's output takes the form of a probability distribution. This module covers how to estimate a probability distribution from data (a “probability histogram”), and how to describe and generate the probability distributions most commonly used by data scientists. It also covers in detail how to develop a binary classification model whose parameters are optimized to maximize the AUC (area under the ROC curve), and how to apply linear regression models when your input consists of multiple types of data for each event. The module concludes with an explanation of “over-fitting,” the main reason that apparently good predictive models often fail in real-life business settings, along with some tips for avoiding over-fitting in your own predictive model for the final project and in real life.
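As a rough illustration of the “probability histogram” idea mentioned above, the following minimal sketch bins a set of samples and divides each bin count by the total, so the bin values sum to 1. The function name `probability_histogram`, the bin range, and the simulated Gaussian data are assumptions made for this example only; they are not taken from the course materials.

```python
import random


def probability_histogram(samples, num_bins, low, high):
    """Estimate bin probabilities from data using equal-width bins.

    Hypothetical helper for illustration: returns a list in which
    entry i is the fraction of in-range samples that fall in bin i.
    """
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for x in samples:
        if low <= x <= high:
            # Clamp the index so that x == high lands in the last bin.
            idx = min(int((x - low) / width), num_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [low, high]")
    # Dividing counts by the total turns frequencies into probabilities.
    return [c / total for c in counts]


# Simulated data (an assumption): 10,000 draws from a standard normal.
random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(10_000)]
probs = probability_histogram(samples, num_bins=8, low=-4.0, high=4.0)

for i, p in enumerate(probs):
    left = -4.0 + i * 1.0
    print(f"[{left:+.1f}, {left + 1.0:+.1f}): {p:.3f}")
```

Because each bin holds a probability rather than a raw count, the result can be compared directly against a theoretical distribution evaluated over the same bins.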
