About this course
4.7
402 ratings
87 reviews
Specialization
100% online

Start right away and learn on your own schedule.
Flexible deadlines

Reset deadlines to fit your availability.
Advanced level

Approx. 48 hours to complete

Recommended: 6-10 hours/week...
Available languages

English

Subtitles: English

Skills you will gain

Data Analysis, Feature Extraction, Feature Engineering, XGBoost

Syllabus: what you will learn in this course

Week 1
6 hours to complete

Introduction & Recap

This week we will introduce you to competitive data science. You will learn about competition mechanics, the differences between competitions and real-life data science, and the hardware and software that people usually use in competitions. We will also briefly recap the major ML models frequently used in competitions....
8 videos (total 46 min), 7 readings, 6 quizzes
Video: 8 videos
Meet your lecturers (2 min)
Course overview (7 min)
Competition Mechanics (6 min)
Kaggle Overview [screencast] (7 min)
Real World Application vs Competitions (5 min)
Recap of main ML algorithms (9 min)
Software/Hardware Requirements (5 min)
Reading: 7 readings
Welcome! (10 min)
Week 1 overview (10 min)
Disclaimer (10 min)
Explanation for quiz questions (10 min)
Additional Materials and Links (10 min)
Explanation for quiz questions (10 min)
Additional Material and Links (10 min)
Quiz: 5 practice exercises
Practice Quiz (8 min)
Recap (8 min)
Recap (12 min)
Software/Hardware (6 min)
Graded Soft/Hard Quiz (8 min)

2 hours to complete

Feature Preprocessing and Generation with Respect to Models

In this module we will summarize approaches to working with features: preprocessing, generation, and extraction. We will see that the choice of machine learning model affects both the preprocessing we apply to the features and our approach to generating new ones. We will also discuss feature extraction from text with Bag of Words and Word2vec, and feature extraction from images with Convolutional Neural Networks....
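As a rough illustration of the Bag of Words idea named in this module description, here is a minimal sketch in plain Python. It is not taken from the course materials: the function name `bag_of_words` and the lowercase, whitespace-based tokenization are assumptions made for the example.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count matrix) for a list of text documents.

    Assumption: tokens are obtained by lowercasing and splitting on whitespace.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # Sorted vocabulary gives every token a stable column index.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    matrix = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for tok, count in Counter(toks).items():
            row[index[tok]] = count
        matrix.append(row)
    return vocab, matrix


vocab, matrix = bag_of_words(["the cat sat", "the cat saw the dog"])
```

Each row of `matrix` is one document and each column is one vocabulary token, so the result can be fed to a model as a set of numeric features.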
7 videos (total 73 min), 4 readings, 4 quizzes
Video: 7 videos
Numeric features (13 min)
Categorical and ordinal features (10 min)
Datetime and coordinates (8 min)
Handling missing values (10 min)
Bag of words (10 min)
Word2vec, CNN (13 min)
Reading: 4 readings
Explanation for quiz questions (10 min)
Additional Material and Links (10 min)
Explanation for quiz questions (10 min)
Additional Material and Links (10 min)
Quiz: 4 practice exercises
Feature preprocessing and generation with respect to models (8 min)
Feature preprocessing and generation with respect to models (8 min)
Feature extraction from text and images (8 min)
Feature extraction from text and images (8 min)

29 minutes to complete

Final Project Description

This is just a reminder that it is best to start the final project for this course soon! The final project is in fact a competition; in this module you can find information about it....
1 video (total 4 min), 2 readings
Video: 1 video
Reading: 2 readings
Final project (10 min)
Final project advice #1 (10 min)

Week 2
2 hours to complete

Exploratory Data Analysis

We will start this week with Exploratory Data Analysis (EDA). It is a very broad and exciting topic and an essential component of the solving process. Besides the regular videos, you will find a walkthrough of the EDA process for the Springleaf competition data and an example of prolific EDA for the NumerAI competition with extraordinary findings....
8 videos (total 80 min), 2 readings, 1 quiz
Video: 8 videos
Building intuition about the data (6 min)
Exploring anonymized data (15 min)
Visualizations (11 min)
Dataset cleaning and other things to check (7 min)
Springleaf competition EDA I (8 min)
Springleaf competition EDA II (16 min)
Numerai competition EDA (6 min)
Reading: 2 readings
Week 2 overview (10 min)
Additional material and links (10 min)
Quiz: 1 practice exercise
Exploratory data analysis (12 min)

2 hours to complete

Validation

In this module we will discuss various validation strategies. We will see that the strategy we choose depends on the competition setup and that a correct validation scheme is one of the building blocks of any winning solution. ...
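To make the idea of choosing a validation strategy to match the competition setup concrete, here is a minimal sketch of two common splitting schemes in plain Python. It is an illustration under stated assumptions, not the course's code: the helper names `kfold_indices` and `time_based_split` are hypothetical, the K-fold split is contiguous and unshuffled, and the time-based split assumes rows are already sorted by time.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, valid_idx) pairs for contiguous, unshuffled K-fold."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("need 2 <= n_folds <= n_samples")
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds receive one additional sample.
        stop = start + base + (1 if fold < extra else 0)
        valid = list(range(start, stop))
        train = list(range(start)) + list(range(stop, n_samples))
        yield train, valid
        start = stop


def time_based_split(n_samples, n_valid):
    """Hold out the last n_valid rows, assuming rows are sorted by time."""
    if not 0 < n_valid < n_samples:
        raise ValueError("need 0 < n_valid < n_samples")
    cut = n_samples - n_valid
    return list(range(cut)), list(range(cut, n_samples))


folds = list(kfold_indices(10, 3))
holdout_train, holdout_valid = time_based_split(10, 3)
```

A K-fold split of this kind suits data whose rows are exchangeable, while a time-based holdout is closer to a setup where the test set lies in the future relative to the training set.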
4 videos (total 51 min), 3 readings, 2 quizzes
Video: 4 videos
Validation strategies (7 min)
Data splitting strategies (14 min)
Problems occurring during validation (20 min)
Reading: 3 readings
Validation strategies (10 min)
Comments on quiz (10 min)
Additional material and links (10 min)
Quiz: 2 practice exercises
Validation (8 min)
Validation (8 min)

5 hours to complete

Data Leakages

Finally, in this module we will cover something unique to data science competitions: we will see examples of how it is sometimes possible to reach a top position in a competition with very little machine learning, just by exploiting a data leak. ...
3 videos (total 26 min), 3 readings, 3 quizzes
Video: 3 videos
Leaderboard probing and examples of rare data leaks (9 min)
Expedia challenge (9 min)
Reading: 3 readings
Comments on quiz (10 min)
Additional material and links (10 min)
Final project advice #2 (10 min)
Quiz: 1 practice exercise
Data leakages (8 min)

Week 3
3 hours to complete

Metrics Optimization

This week we will first study another component of competitions: the evaluation metrics. We will recap the most prominent ones and then see how we can efficiently optimize the metric given in a competition....
8 videos (total 83 min), 3 readings, 2 quizzes
Video: 8 videos
Regression metrics review I (14 min)
Regression metrics review II (8 min)
Classification metrics review (20 min)
General approaches for metrics optimization (6 min)
Regression metrics optimization (10 min)
Classification metrics optimization I (7 min)
Classification metrics optimization II (6 min)
Reading: 3 readings
Week 3 overview (10 min)
Comments on quiz (10 min)
Additional material and links (10 min)
Quiz: 2 practice exercises
Metrics (12 min)
Metrics (12 min)

4 hours to complete

Advanced Feature Engineering I

In this module we will study a very powerful technique for feature generation. It goes by many names, but here we call it "mean encodings". We will see the intuition behind them and how to construct, regularize, and extend them. ...
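As a small illustration of the mean-encoding idea, the sketch below replaces each category with a smoothed mean of the target. It is not taken from the course materials: the function names, the additive smoothing formula, and the parameter `alpha` are assumptions chosen for the example, and smoothing is only one possible way to regularize such encodings.

```python
from collections import defaultdict


def fit_mean_encoding(categories, targets, alpha=2.0):
    """Return ({category: smoothed target mean}, global target mean).

    Smoothing (an assumption of this sketch) pulls rare categories
    toward the global mean:
        (sum_of_targets + alpha * global_mean) / (count + alpha)
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    encoding = {
        cat: (sums[cat] + alpha * global_mean) / (counts[cat] + alpha)
        for cat in counts
    }
    return encoding, global_mean


def apply_mean_encoding(categories, encoding, global_mean):
    """Encode values, falling back to the global mean for unseen categories."""
    return [encoding.get(cat, global_mean) for cat in categories]


encoding, global_mean = fit_mean_encoding(["a", "a", "b", "b"], [1, 0, 1, 1])
encoded = apply_mean_encoding(["a", "z"], encoding, global_mean)
```

In practice the encoding would be fitted on training rows only, since computing it on the same rows it is later applied to lets target information leak into the feature.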
3 videos (total 27 min), 2 readings, 2 quizzes
Video: 3 videos
Regularization (7 min)
Extensions and generalizations (10 min)
Reading: 2 readings
Comments on quiz (10 min)
Final project advice #3 (10 min)
Quiz: 1 practice exercise
Mean encodings (8 min)

Week 4
3 hours to complete

Hyperparameter Optimization

In this module we will talk about the hyperparameter optimization process. We will also have a special video with practical tips and tricks, recorded by four instructors....
6 videos (total 86 min), 4 readings, 2 quizzes
Video: 6 videos
Hyperparameter tuning II (12 min)
Hyperparameter tuning III (13 min)
Practical guide (16 min)
KazAnova's competition pipeline, part 1 (18 min)
KazAnova's competition pipeline, part 2 (17 min)
Reading: 4 readings
Week 4 overview (10 min)
Comments on quiz (10 min)
Additional material and links (10 min)
Additional materials and links (10 min)
Quiz: 2 practice exercises
Practice quiz (6 min)
Graded quiz (8 min)

4 hours to complete

Advanced feature engineering II

In this module we will learn about a few more advanced feature engineering techniques....
4 videos (total 22 min), 2 readings, 2 quizzes
Video: 4 videos
Matrix factorizations (6 min)
Feature Interactions (5 min)
t-SNE (5 min)
Reading: 2 readings
Comments on quiz (10 min)
Additional Materials and Links (10 min)
Quiz: 1 practice exercise
Graded Advanced Features II Quiz (12 min)

10 hours to complete

Ensembling

Nowadays it is hard to find a competition won by a single model! Every winning solution incorporates ensembles of models. In this module we will talk about the main ensembling techniques in general and, of course, about how best to ensemble models in practice. ...
8 videos (total 92 min), 4 readings, 4 quizzes
Video: 8 videos
Bagging (9 min)
Boosting (16 min)
Stacking (16 min)
StackNet (14 min)
Ensembling Tips and Tricks (14 min)
CatBoost 1 (7 min)
CatBoost 2 (7 min)
Reading: 4 readings
Validation schemes for 2nd-level models (10 min)
Comments on quiz (10 min)
Additional materials and links (10 min)
Final project advice #4 (10 min)
Quiz: 2 practice exercises
Ensembling (8 min)
Ensembling (12 min)
4.7
87 reviews
Career direction

33%

started a new career after completing these courses
Career benefit

83%

got a tangible career benefit from this course

Top reviews

by MS, Mar 29th 2018

Top Kagglers gently introduce one to Data Science Competitions. One will have a great chance to learn various tips and tricks and apply them in practice throughout the course. Highly recommended!

by MM, Nov 10th 2017

This course is fantastic. It's chock full of practical information that is presented clearly and concisely. I would like to thank the team for sharing their knowledge so generously.

Instructors

Dmitry Ulyanov

Visiting lecturer
HSE Faculty of Computer Science

Alexander Guschin

Visiting lecturer at HSE, Lecturer at MIPT
HSE Faculty of Computer Science

Mikhail Trofimov

Visiting lecturer
HSE Faculty of Computer Science

Dmitry Altukhov

Visiting lecturer
HSE Faculty of Computer Science

Marios Michailidis

Research Data Scientist
H2O.ai

About National Research University Higher School of Economics

National Research University - Higher School of Economics (HSE) is one of the top research universities in Russia. Established in 1992 to promote new research and teaching in economics and related disciplines, it now offers programs at all levels of university education across an extraordinary range of fields of study including business, sociology, cultural studies, philosophy, political science, international relations, law, Asian studies, media and communications, IT, mathematics, engineering, and more. Learn more on www.hse.ru...

About the Advanced Machine Learning Specialization

This specialization gives an introduction to deep learning, reinforcement learning, natural language understanding, computer vision and Bayesian methods. Top Kaggle machine learning practitioners and CERN scientists will share their experience of solving real-world problems and help you to fill the gaps between theory and practice. Upon completion of 7 courses you will be able to apply modern machine learning methods in enterprise and understand the caveats of real-world data and settings....

Frequently Asked Questions

  • Once you enroll for a Certificate, you get access to all course videos, quizzes, and programming assignments (if applicable). You can submit peer-reviewed assignments and review those of your peers only after your session has begun. If you choose to explore the course without purchasing it, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a Certificate when you complete the work. Your electronic Certificate is then added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.