Lecturer: Riccardo Tommasini, PhD

Special thanks to Emanuele Della Valle and Marco Brambilla from Politecnico di Milano for letting me "steal" some of their great slides.

| Date | Title | Material | Mandatory Reads | Extras |
|---|---|---|---|---|
| 01/09 | Course Intro | Slides - pdf slide 45-109 | | |
| 03/09 | Data Modeling | Slides - pdf slide 1-44 | Chp 4 p111-127, Chp 5 p151-156, Chp 6 p199-205 of [3] | |
| 10/09 | DM for Relational Databases | Slides - pdf slide 45-109 | Chp 2, 6, and 7 (Normal Forms) of [1] | Relational Model |
| 10/09 | DM for Data Warehouse | Slides - pdf slide 109-118 | pdf video | Chp 2 of [2] |
| 17/09 | DM for Big Data | Slides - pdf | Chp 2 of [3], video | paper |
| 17/09 | Key Value Stores | Slides | ||
| | Column Oriented Databases | | | |
| | Document Databases | | | |
| | Graph Databases | | | |
| | Data Engineering Pipelines | | Chp 1 of [3] | |
| | Keynote TBA | | | |
| | Streaming Data | | Chp 11 of [3] | |
| | Data Wrangling | | | |

| Date | Title | Material | Reads | Videos | Assignment | Notes |
|---|---|---|---|---|---|---|
| 07-08/09 | Docker | Slides - Lab Branch | | Video GP1, Video GP2 | QA GP2 only | |
| 14-15/09 | Modeling and Querying Relational Data with Postgres | Slides | Chp 32 of [1]§ | Video | | |
| 21-22/09 | Modeling and Querying Key Value Data with Redis | Slides | | Video | | |
| 28-29/09 | Modeling and Querying Document Data with MongoDB | |||||
| 05-06/10 | Modeling and Querying Graph Data with Neo4j | | | | | |
| | Data Ingestion with Apache Kafka | | | | | |
| | Apache Airflow Data Pipelines | | | | | |
| | Stream Processing with Kafka Streams | | | | | |
| | Stream Processing with KSQL | | | | | |
| | Data Cleansing | | | | | |
| | Augmentation | | | | | |
- Modeling and Querying RDF data: SPARQL
- Domain Driven Design: a summary
- Event Sourcing: a summary
- Data Pipelines with Luigi
- Data Pipelines with Apache NiFi
- Data Processing with Apache Flink
- What is (Big) Data?
- The Role of Data Engineer
- Data Modeling
- Data Replication
- Data Partitioning
- Transactions
- Relational Data
- NoSQL
- Document
- Graph
- Data Warehousing
- Star and Snowflake schemas
- Data Vault
- (Big) Data Pipelines
- Big Data Systems Architectures
- ETL and Data Pipelines
- Best Practices and Anti-Patterns
- Batch vs Streaming Processing
- Data Cleansing
- Data Augmentation
- [1] Database System Concepts, 7th Edition. Avi Silberschatz, Henry F. Korth, S. Sudarshan. McGraw-Hill. ISBN 9780078022159
- [2] The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, Third Edition. Ralph Kimball, Margy Ross.
- [3] Designing Data-Intensive Applications. Martin Kleppmann.
- [4] Designing Event-Driven Systems
[[slides/Slides]]

