The Path to Insights: Data Models and Pipelines
- Offered by Coursera
The Path to Insights: Data Models and Pipelines at Coursera Overview
Duration | 23 hours |
Start from | Start Now |
Total fee | Free |
Mode of learning | Online |
Difficulty level | Advanced |
Credential | Certificate |
The Path to Insights: Data Models and Pipelines at Coursera Highlights
- Flexible deadlines: Reset deadlines according to your schedule.
- Shareable Certificate: Earn a certificate upon completion.
- 100% online: Start instantly and learn on your own schedule.
- Coursera Labs: Includes hands-on learning projects.
- Advanced Level
- Approx. 23 hours to complete
- Language: English. Subtitles: English.
The Path to Insights: Data Models and Pipelines at Coursera Course details
- This is the second of three courses in the Google Business Intelligence Certificate. In this course, you'll explore data modeling and how databases are designed. Then you’ll learn about extract, transform, load (ETL) processes that extract data from source systems, transform it into formats that enable analysis, and drive business processes and goals.
- Google employees who currently work in BI will guide you through this course by providing hands-on activities that simulate job tasks, sharing examples from their day-to-day work, and helping you build business intelligence skills to prepare for a career in the field.
- Learners who complete the three courses in this certificate program will have the skills needed to apply for business intelligence jobs. This certificate program assumes prior knowledge of foundational analytical principles, skills, and tools covered in the Google Data Analytics Certificate.
- By the end of this course, you will:
  - Determine which data models are appropriate for different business requirements
  - Describe the difference between creating and interacting with a data model
  - Create data models to address different types of questions
  - Explain the parts of the extract, transform, load (ETL) process and tools used in ETL
  - Understand extraction processes and tools for different data storage systems
  - Design an ETL process that meets organizational and stakeholder needs
  - Design data pipelines to automate BI processes
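For readers new to the term, the extract, transform, load (ETL) steps described above can be sketched in a few lines of Python. This is only an illustrative sketch and is not taken from the course materials: the sample CSV data, the `orders` table, and the function names `extract`, `transform`, `load`, and `run_pipeline` are all hypothetical, and an in-memory SQLite database stands in for a real target system such as a data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export standing in for a source system (assumption, not course data).
RAW_CSV = """order_id,amount,country
1, 19.99 ,us
2,5.00,DE
3,,us
"""


def extract(text):
    """Extract: read raw rows from the source (here, an in-memory CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, cast types, normalize country codes."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records instead of loading them
        cleaned.append(
            (int(row["order_id"]), float(amount), row["country"].strip().upper())
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a target table ready for analysis."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Chain the three steps into one small pipeline."""
    load(transform(extract(RAW_CSV)), conn)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_pipeline(conn)
    query = "SELECT country, SUM(amount) FROM orders GROUP BY country ORDER BY country"
    for row in conn.execute(query):
        print(row)
```

Keeping each step in its own function mirrors how ETL is usually described: the extraction source or the load target can be swapped without touching the transformation logic.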
The Path to Insights: Data Models and Pipelines at Coursera Curriculum
Data models and pipelines
Introduction to Course 2
Ed: Overcome imposter syndrome
Welcome to week 1
Data modeling, design patterns, and schemas
Get the facts with dimensional models
Dimensional models with star and snowflake schemas
Different data types, different databases
The shape of the data
Design useful database schemas
Data pipelines and the ETL process
Maximize data through the ETL process
Choose the right tool for the job
Introduction to Dataflow
Coding with Python
Gather information from stakeholders
Wrap-up
[Optional] Review Google Data Analytics Certificate content about data types
[Optional] Review Google Data Analytics Certificate content about primary and foreign keys
[Optional] Review Google Data Analytics Certificate content about BigQuery
[Optional] Review Google Data Analytics Certificate content about SQL
Helpful resources and tips
Course 2 overview
Design efficient database systems with schemas
Database comparison checklist
Four key elements of database schemas
Review a database schema
Business intelligence tools and their applications
ETL-specific tools and their applications
Guide to Dataflow
Python applications and resources
Merge data from multiple sources with BigQuery
Unify data with target tables
Activity Exemplar: Create a target table in BigQuery
Case study: Wayfair - Working with stakeholders to create a pipeline
Glossary terms from week 1
[Optional] Review Google Data Analytics Certificate content about SQL best practices
Test your knowledge: Data modeling, schemas, and databases
Test your knowledge: Choose the right database
Test your knowledge: How data moves
[Optional] Activity: Create a Google Cloud account
[Optional] Activity: Create a streaming pipeline in Dataflow
Activity: Set up a sandbox and query a public dataset in BigQuery
Activity: Create a target table in BigQuery
Weekly challenge 1
Dynamic database design
Welcome to week 2
Data marts, data lakes, and the ETL process
The five factors of database performance
Optimize database performance
The five factors in action
Wrap-up
ETL versus ELT
A guide to the five factors of database performance
Indexes, partitions, and other ways to optimize
Activity Exemplar: Partition data and create indexes in BigQuery
Case study: Deloitte - Optimizing outdated database systems
Determine the most efficient query
Glossary terms from week 2
Activity: Partition data and create indexes in BigQuery
Test your knowledge: Database performance
Weekly challenge 2
Optimize ETL processes
Welcome to week 3
The importance of quality testing
Mana: Quality data is useful data
Conformity from source to destination
Check your schema
Verify business rules
Burak: Evolving technology
Wrap-up
[Optional] Review Google Data Analytics Certificate content about data integrity
[Optional] Review Google Data Analytics Certificate content about metadata
Seven elements of quality testing
Monitor data quality with SQL
Sample data dictionary and data lineage
Schema-validation checklist
Activity Exemplar: Evaluate a schema using a validation checklist
Business rules
Database performance testing in an ETL context
Defend against known issues
Case study: FeatureBase, Part 2: Alternative solutions to pipeline systems
Glossary terms from week 3
Test your knowledge: Optimize pipelines and ETL processes
Activity: Evaluate a schema using a validation checklist
Test your knowledge: Data schema validation
Test your knowledge: Business rules and performance testing
Weekly challenge 3
Course 2 end-of-course project
Welcome to week 4
Continue your end-of-course project
Tips for ongoing success with your end-of-course project
Luis: Tips for interview preparation
Course wrap-up
Explore Course 2 end-of-course project scenarios
Course 2 workplace scenario overview: Cyclistic
Cyclistic datasets
Observe the Cyclistic team in action
Activity Exemplar: Create your target table for Cyclistic
Course 2 workplace scenario overview: Google Fiber
Google Fiber datasets
[Optional] Merge Google Fiber datasets in Tableau
Activity Exemplar: Create your target table for Google Fiber
Course 2 glossary
Get started on Course 3
Activity: Create your target table for Cyclistic
Activity: Create your target table for Google Fiber
Assess your Course 2 end-of-course project