About this course
4.3
663 ratings
142 reviews
Specialization
100% online

Start right away and learn on your own schedule.
Flexible deadlines

Reset deadlines to fit your availability.
Hours to complete

Approx. 21 hours to complete

Recommended: 4 weeks of study, 6-8 hours/week...
Available languages

English

Subtitles: English

Skills you will gain

Relational Algebra, Python Programming, MapReduce, SQL

Syllabus: what you will learn in this course

Week 1
6 hours to complete

Data Science Context and Concepts

Understand the terminology and recurring principles associated with data science, and understand the structure of data science projects and emerging methodologies to approach them. Why does this emerging field exist? How does it relate to other fields? How does this course distinguish itself? What do data science projects look like, and how should they be approached? What are some examples of data science projects? ...
22 videos (Total 125 min), 4 readings, 1 quiz
Video: 22 videos
Appetite Whetting: Extreme Weather (2 min)
Appetite Whetting: Digital Humanities (8 min)
Appetite Whetting: Bibliometrics (4 min)
Appetite Whetting: Food, Music, Public Health (5 min)
Appetite Whetting: Public Health cont'd, Earthquakes, Legal (4 min)
Characterizing Data Science (5 min)
Characterizing Data Science, cont'd (5 min)
Distinguishing Data Science from Related Topics (4 min)
Four Dimensions of Data Science (6 min)
Tools vs. Abstractions (7 min)
Desktop Scale vs. Cloud Scale (5 min)
Hackers vs. Analysts (2 min)
Structs vs. Stats (5 min)
Structs vs. Stats cont'd (5 min)
A Fourth Paradigm of Science (3 min)
Data-Intensive Science Examples (6 min)
Big Data and the 3 Vs (5 min)
Big Data Definitions (4 min)
Big Data Sources (6 min)
Course Logistics (7 min)
Twitter Assignment: Getting Started (14 min)
Reading: 4 readings
Supplementary: Three-Course Reading List (10 min)
Supplementary: Resources for Learning Python (10 min)
Supplementary: Class Virtual Machine (10 min)
Supplementary: Github Instructions (10 min)
Week 2
5 hours to complete

Relational Databases and the Relational Algebra

Relational databases are the workhorse of large-scale data management. Although originally motivated by problems in enterprise operations, they have proven remarkably capable for analytics as well. But most importantly, the principles underlying relational databases are universal in managing, manipulating, and analyzing data at scale. Even as the landscape of large-scale data systems has expanded dramatically in the last decade, relational models and languages have remained a unifying concept. For working with large-scale data, there is no more important programming model to learn....
24 videos (Total 122 min), 1 quiz
Video: 24 videos
From Data Models to Databases (4 min)
Pre-Relational Databases (5 min)
Motivating Relational Databases (3 min)
Relational Databases: Key Ideas (4 min)
Algebraic Optimization Overview (6 min)
Relational Algebra Overview (4 min)
Relational Algebra Operators: Union, Difference, Selection (6 min)
Relational Algebra Operators: Projection, Cross Product (4 min)
Relational Algebra Operators: Cross Product cont'd, Join (6 min)
Relational Algebra Operators: Outer Join (4 min)
Relational Algebra Operators: Theta-Join (4 min)
From SQL to RA (6 min)
Thinking in RA: Logical Query Plans (4 min)
Practical SQL: Binning Timeseries (5 min)
Practical SQL: Genomic Intervals (6 min)
User-Defined Functions (3 min)
Support for User-Defined Functions (4 min)
Optimization: Physical Query Plans (5 min)
Optimization: Choosing Physical Plans (4 min)
Declarative Languages (5 min)
Declarative Languages: More Examples (4 min)
Views: Logical Data Independence (5 min)
Indexes (6 min)
Week 3
5 hours to complete

MapReduce and Parallel Dataflow Programming

The MapReduce programming model (as distinct from its implementations) was proposed as a simplifying abstraction for parallel manipulation of massive datasets, and remains an important concept to know when using and evaluating modern big data platforms. ...
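
To make the distinction between the programming model and its implementations concrete, here is a minimal single-process sketch in Python. It is not taken from the course materials; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_mapreduce`) and the word-count task are illustrative assumptions. A real platform would distribute each phase across many machines, but the user-supplied mapper and reducer would keep the same shape.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every input record and collect (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer once per key to the list of values grouped under it."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_mapreduce(records, mapper, reducer):
    """Run map, shuffle, and reduce in sequence (hypothetical local driver)."""
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


# User-supplied functions for a word-count job (illustrative example).
def word_count_mapper(line):
    """Emit (word, 1) for each whitespace-separated word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def word_count_reducer(word, counts):
    """Sum the per-occurrence counts emitted for one word."""
    return sum(counts)


if __name__ == "__main__":
    lines = ["big data at scale", "data science at scale"]
    print(run_mapreduce(lines, word_count_mapper, word_count_reducer))
```

Only the mapper and reducer are specific to the task; the driver is generic, which is the sense in which the model is an abstraction separate from any particular system.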
26 videos (Total 122 min), 1 quiz
Video: 26 videos
A Sketch of Algorithmic Complexity (5 min)
A Sketch of Data-Parallel Algorithms (5 min)
"Pleasingly Parallel" Algorithms (4 min)
More General Distributed Algorithms (4 min)
MapReduce Abstraction (4 min)
MapReduce Data Model (3 min)
Map and Reduce Functions (2 min)
MapReduce Simple Example (3 min)
MapReduce Simple Example cont'd (3 min)
MapReduce Example: Word Length Histogram (2 min)
MapReduce Examples: Inverted Index, Join (6 min)
Relational Join: Map Phase (4 min)
Relational Join: Reduce Phase (4 min)
Simple Social Network Analysis: Counting Friends (3 min)
Matrix Multiply Overview (5 min)
Matrix Multiply Illustrated (4 min)
Shared Nothing Computing (4 min)
MapReduce Implementation (5 min)
MapReduce Phases (6 min)
A Design Space for Large-Scale Data Systems (4 min)
Parallel and Distributed Query Processing (5 min)
Teradata Example, MR Extensions (5 min)
RDBMS vs. MapReduce: Features (6 min)
RDBMS vs. Hadoop: Grep (5 min)
RDBMS vs. Hadoop: Select, Aggregate, Join (3 min)
Week 4
3 hours to complete

NoSQL: Systems and Concepts

NoSQL systems are purely about scale rather than analytics, and are arguably less relevant for the practicing data scientist. However, they occupy an important place in many practical big data platform architectures, and data scientists need to understand their limitations and strengths to use them effectively....
36 videos (Total 166 min)
Video: 36 videos
NoSQL Roundup (4 min)
Relaxing Consistency Guarantees (3 min)
Two-Phase Commit and Consensus Protocols (5 min)
Eventual Consistency (4 min)
CAP Theorem (4 min)
Types of NoSQL Systems (4 min)
ACID, Major Impact Systems (4 min)
Memcached: Consistent Hashing (2 min)
Consistent Hashing, cont'd (4 min)
DynamoDB: Vector Clocks (5 min)
Vector Clocks, cont'd (5 min)
CouchDB Overview (4 min)
CouchDB Views (3 min)
BigTable Overview (5 min)
BigTable Implementation (5 min)
HBase, Megastore (3 min)
Spanner (5 min)
Spanner cont'd, Google Systems (6 min)
MapReduce-based Systems (5 min)
Bringing Back Joins (4 min)
NoSQL Rebuttal (4 min)
Almost SQL: Pig (4 min)
Pig Architecture and Performance (3 min)
Data Model (3 min)
Load, Filter, Group (5 min)
Group, Distinct, Foreach, Flatten (5 min)
CoGroup, Join (3 min)
Join Algorithms (3 min)
Skew (5 min)
Other Commands (3 min)
Evaluation Walkthrough (3 min)
Review (6 min)
Context (3 min)
Spark Examples (5 min)
RDDs, Benefits (6 min)
2 hours to complete

Graph Analytics

Graph-structured data are increasingly common in data science contexts due to their ubiquity in modeling the communication between entities: people (social networks), computers (Internet communication), cities and countries (transportation networks), or corporations (financial transactions). Learn the common algorithms for extracting information from graph data and how to scale them up. ...
21 videos (Total 91 min)
Video: 21 videos
Structural Analysis (4 min)
Degree Histograms, Structure of the Web (4 min)
Connectivity and Centrality (4 min)
PageRank (3 min)
PageRank in more Detail (3 min)
Traversal Tasks: Spanning Trees and Circuits (5 min)
Traversal Tasks: Maximum Flow (1 min)
Pattern Matching (6 min)
Querying Edge Tables (4 min)
Relational Algebra and Datalog for Graphs (4 min)
Querying Hybrid Graph/Relational Data (3 min)
Graph Query Example: NSA (6 min)
Graph Query Example: Recursion (4 min)
Evaluation of Recursive Programs (3 min)
Recursive Queries in MapReduce (4 min)
The End-Game Problem (3 min)
Representation: Edge Table, Adjacency List (4 min)
Representation: Adjacency Matrix (2 min)
PageRank in MapReduce (5 min)
PageRank in Pregel (5 min)
4.3
142 reviews

Top reviews

by HA, Jan 11th 2016

Great course that strikes a balance between teaching general principles and concepts, and providing hands-on technical skills and practice. The lessons are well designed and clearly conveyed.

by SL, May 28th 2016

I like the breadth of coverage of this class. Each of the exercise is a gem in that I get to learn something new also. I would highly recommend this even to experience practitioner also.

Instructor

Bill Howe

Director of Research
Scalable Data Analytics

About the University of Washington

Founded in 1861, the University of Washington is one of the oldest state-supported institutions of higher education on the West Coast and is one of the preeminent research universities in the world....

About the Data Science at Scale Specialization

Learn scalable data management, evaluate big data technologies, and design effective visualizations. This Specialization covers intermediate topics in data science. You will gain hands-on experience with scalable SQL and NoSQL data management solutions, data mining algorithms, and practical statistical and machine learning concepts. You will also learn to visualize data and communicate results, and you’ll explore legal and ethical issues that arise in working with big data. In the final Capstone Project, developed in partnership with the digital internship platform Coursolve, you’ll apply your new skills to a real-world data science project....
Data Science at Scale

Frequently Asked Questions

  • Once you enroll for a Certificate, you can access all course videos, quizzes, and programming assignments (if applicable). You can submit peer-reviewed assignments and review those of your peers only after your session has begun. If you choose to explore the course without purchasing it, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a Certificate when you complete the work. Your electronic Certificate is then added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.