Exploratory Data Analysis

4.7
3,538 ratings
543 reviews

Course 4 of 10 in the Data Science Specialization

This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data.
100% online course

Start right away and learn on your own schedule.

Approx. 15 hours to complete

Recommended: 5 hours/week

English

Subtitles: English, Chinese (Simplified)

What you will learn

  • Apply cluster analysis techniques to locate patterns in data
  • Make graphical displays of very high dimensional data
  • Understand analytic graphics and the base plotting system in R
  • Use advanced graphing systems such as the Lattice system

Skills you will gain

R Programming, Cluster Analysis, Data Visualization, Data Analysis

Syllabus - What you will learn from this course

1

20 hours to complete

Week 1

This week covers the basics of analytic graphics and the base plotting system in R. We've also included some background material to help you install R if you haven't done so already. ...
15 videos (Total 109 min), 6 readings, 7 quizzes

15 videos
  • Installing R on Windows (3.2.1) - 3m
  • Installing R on a Mac (3.2.1) - 1m
  • Installing R Studio (Mac) - 3m
  • Setting Your Working Directory (Windows) - 7m
  • Setting Your Working Directory (Mac) - 7m
  • Principles of Analytic Graphics - 12m
  • Exploratory Graphs (part 1) - 9m
  • Exploratory Graphs (part 2) - 5m
  • Plotting Systems in R - 9m
  • Base Plotting System (part 1) - 11m
  • Base Plotting System (part 2) - 6m
  • Base Plotting Demonstration - 16m
  • Graphics Devices in R (part 1) - 5m
  • Graphics Devices in R (part 2) - 7m

6 readings
  • Welcome to Exploratory Data Analysis - 10m
  • Syllabus - 10m
  • Pre-Course Survey - 10m
  • Exploratory Data Analysis with R Book - 10m
  • The Art of Data Science - 10m
  • Practical R Exercises in swirl Part 1 - 10m

1 practice exercise
  • Week 1 Quiz - 20m

2

17 hours to complete

Week 2

Welcome to Week 2 of Exploratory Data Analysis. This week covers some of the more advanced graphing systems available in R: the Lattice system and the ggplot2 system. While the base graphics system provides many important tools for visualizing data, it was part of the original R system and lacks many features that may be desirable in a plotting system, particularly when visualizing high dimensional data. The Lattice and ggplot2 systems also simplify the laying out of plots, making it a much less tedious process....
7 videos (Total 61 min), 1 reading, 6 quizzes

7 videos
  • Lattice Plotting System (part 2) - 6m
  • ggplot2 (part 1) - 6m
  • ggplot2 (part 2) - 13m
  • ggplot2 (part 3) - 9m
  • ggplot2 (part 4) - 10m
  • ggplot2 (part 5) - 8m

1 reading
  • Practical R Exercises in swirl Part 2 - 10m

1 practice exercise
  • Week 2 Quiz - 20m

3

13 hours to complete

Week 3

Welcome to Week 3 of Exploratory Data Analysis. This week covers some of the workhorse statistical methods for exploratory analysis. These methods include clustering and dimension reduction techniques that allow you to make graphical displays of very high dimensional data (many, many variables). We also cover novel ways to specify colors in R so that you can use color as an important and useful dimension when making data graphics. All of this material is covered in chapters 9-12 of my book Exploratory Data Analysis with R....
12 videos (Total 77 min), 1 reading, 4 quizzes

12 videos
  • Hierarchical Clustering (part 2) - 5m
  • Hierarchical Clustering (part 3) - 7m
  • K-Means Clustering (part 1) - 5m
  • K-Means Clustering (part 2) - 4m
  • Dimension Reduction (part 1) - 7m
  • Dimension Reduction (part 2) - 9m
  • Dimension Reduction (part 3) - 6m
  • Working with Color in R Plots (part 1) - 4m
  • Working with Color in R Plots (part 2) - 7m
  • Working with Color in R Plots (part 3) - 6m
  • Working with Color in R Plots (part 4) - 3m

1 reading
  • Practical R Exercises in swirl Part 3 - 10m

4

6 hours to complete

Week 4

This week, we'll look at two case studies in exploratory data analysis. The first involves the use of cluster analysis techniques, and the second is a more involved analysis of some air pollution data. How one goes about doing EDA is often personal, but I'm providing these videos to give you a sense of how you might proceed with a specific type of dataset. ...
2 videos (Total 55 min), 2 readings, 2 quizzes

2 videos
  • Air Pollution Case Study - 40m

2 readings
  • Practical R Exercises in swirl Part 4 - 10m
  • Post-Course Survey - 10m

37% started a new career after completing these courses

83% got a tangible career benefit from this course

18% got a pay increase or promotion

Top Reviews

By Y, Sep 24th 2017

Very good course! It provide me the foundation in learning how to plot and interpret data. This will definitely strengthen my "R programming" to generate publication type figure for my genomics data!

By IA, Jan 18th 2016

Very nice course, plotting data to explore and understand various features and their relationship is the key in any research domain, and this course teaches the skill required to achieve this.

Instructors

Roger D. Peng, PhD

Associate Professor, Biostatistics

Jeff Leek, PhD

Associate Professor, Biostatistics

Brian Caffo, PhD

Professor, Biostatistics

About Johns Hopkins University

The mission of The Johns Hopkins University is to educate its students and cultivate their capacity for life-long learning, to foster independent and original research, and to bring the benefits of discovery to the world....

Frequently Asked Questions

  • Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • If you pay for this course, you will have access to all of the features and content you need to earn a Course Certificate. If you complete the course successfully, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. Note that the Course Certificate does not represent official academic credit from the partner institution offering the course.

  • Yes! Coursera provides financial aid to learners who would like to complete a course but cannot afford the course fee. To apply for aid, select "Learn more and apply" in the Financial Aid section below the "Enroll" button. You'll be prompted to complete a simple application; no other paperwork is required.
