
Scaling mixtures of Gaussians for document clustering

Course video 46 of 78

In k-means, each observation is hard-assigned to a single cluster, and these assignments are based only on the cluster centers, without incorporating shape information. In our second module on clustering, you will perform probabilistic model-based clustering that (1) provides a more descriptive notion of a "cluster" and (2) accounts for uncertainty in the assignment of data points to clusters via "soft assignments". You will explore and implement a broadly useful algorithm called expectation maximization (EM) for inferring these soft assignments as well as the model parameters. To gain intuition, you will first consider a visually appealing image clustering task. You will then cluster Wikipedia articles, handling the high dimensionality of the tf-idf document representation.
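
To make the idea of soft assignments concrete, here is a minimal sketch of EM for a mixture of Gaussians in NumPy. It is not the course's own implementation. It assumes diagonal covariance matrices, a common simplification for high-dimensional data such as tf-idf vectors, and the names `log_diag_gaussian`, `em_diag_gmm`, and `var_floor` are hypothetical, chosen only for illustration.

```python
import numpy as np

def log_diag_gaussian(X, mean, var):
    """Log density of each row of X under N(mean, diag(var))."""
    d = X.shape[1]
    return -0.5 * (d * np.log(2 * np.pi)
                   + np.sum(np.log(var))
                   + np.sum((X - mean) ** 2 / var, axis=1))

def em_diag_gmm(X, k, n_iter=50, var_floor=1e-6, seed=0):
    """Fit a k-component diagonal-covariance Gaussian mixture with EM.

    Returns mixture weights, means, per-dimension variances, and the
    (n, k) matrix of soft assignments (responsibilities).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumed initialization: k distinct data points as means, the
    # overall per-dimension variance for every cluster, uniform weights.
    means = X[rng.choice(n, size=k, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + var_floor, (k, 1))
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: compute soft assignments in log space for stability.
        log_r = np.stack(
            [np.log(weights[j]) + log_diag_gaussian(X, means[j], variances[j])
             for j in range(k)],
            axis=1,
        )
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate parameters from the soft counts.
        nk = resp.sum(axis=0) + 1e-12  # guard against empty clusters
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        second_moment = (resp.T @ (X ** 2)) / nk[:, None]
        # Floor the variances so no dimension collapses to zero.
        variances = np.maximum(second_moment - means ** 2, var_floor)

    return weights, means, variances, resp

# Usage on synthetic data: two blobs in two dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)),
               rng.normal(10.0, 1.0, (100, 2))])
weights, means, variances, resp = em_diag_gmm(X, k=2)
```

Each row of `resp` holds one data point's soft assignment, a probability distribution over the clusters, in contrast to the single hard label that k-means would produce. The variance floor is one simple way to keep a cluster from collapsing onto a dimension with no spread, which can easily happen with sparse document vectors.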
