Advanced Data Engineering

Offered by Coursera

Overview

This course equips participants with the skills to manage the increasing volume, velocity, and variety of data effectively.

Duration	23 hours
Start from	Start Now
Total fee	Free
Mode of learning	Online
Credential	Certificate

Highlights

Earn a certificate from Coursera
Add to your LinkedIn profile
14 quizzes

Course details

Skills you will learn

Data Modeling, Python, Data Processing, Software Testing, Linux, Apache, Data analysis, MySQL

What are the course deliverables?

Create and manage data pipelines and their lifecycle
Connect and work with message queues to manage data processing
Use vector, graph, and key/value databases for data storage at scale
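
As a rough illustration of the message-queue deliverable, the sketch below shows the producer/consumer pattern using only Python's standard library. It is not the course's code: the in-process `queue.Queue` stands in for a broker such as RabbitMQ, the thread stands in for a Celery worker, and the names `worker` and `process` are hypothetical.

```python
import queue
import threading


def worker(tasks: queue.Queue, results: dict) -> None:
    """Consume (task_id, number) messages until a None sentinel arrives."""
    while True:
        message = tasks.get()
        if message is None:  # sentinel: the producer has no more work
            tasks.task_done()
            break
        task_id, number = message
        results[task_id] = number * number  # stand-in for real data processing
        tasks.task_done()


def process(numbers: list) -> dict:
    """Enqueue one message per number, then wait for the worker to finish."""
    tasks: queue.Queue = queue.Queue()
    results: dict = {}
    consumer = threading.Thread(target=worker, args=(tasks, results))
    consumer.start()
    for task_id, number in enumerate(numbers):
        tasks.put((task_id, number))  # put() returns without waiting for the work
    tasks.put(None)
    tasks.join()      # block until every queued message has been handled
    consumer.join()
    return results


if __name__ == "__main__":
    print(process([2, 3, 4]))
```

The producer never waits on an individual task; it only waits once at the end, which is the same separation of submission from execution that a real broker provides across processes and machines.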

More about this course

In this advanced course, you will gain practical expertise in scaling data engineering systems using current tools and techniques.
This course is designed for data scientists, data engineers, and anyone with a foundational understanding of data handling who wants to advance their skills to handle larger, more complex datasets efficiently.
Throughout the course, you'll apply technologies such as Celery with RabbitMQ for scalable data consumption, Apache Airflow for workflow management, and vector and graph databases for data management at scale.
The course culminates in hands-on projects that offer real-world experience, where you'll put your acquired skills to the test by solving data engineering challenges.
You will learn not only to create scalable data systems but also to analyze their performance and adjust them for better results.
This experience with advanced data engineering techniques will prepare you for handling massive datasets, streamlining complex workflows, and optimizing data operations for businesses of any scale.
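
To make the workflow-management idea above more concrete, here is a minimal sketch of how a directed acyclic graph (DAG) of tasks determines execution order. It uses the standard-library `graphlib` module rather than Apache Airflow itself, and the task names in `PIPELINE` and the helper functions are hypothetical, chosen only for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
PIPELINE = {
    "retrieve": set(),
    "clean": {"retrieve"},
    "normalize": {"clean"},
    "persist": {"normalize"},
}


def execution_order(dag: dict) -> list:
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


def is_acyclic(dag: dict) -> bool:
    """Report whether the dependency graph can be scheduled at all."""
    try:
        execution_order(dag)
    except CycleError:
        return False
    return True


if __name__ == "__main__":
    print(execution_order(PIPELINE))
    print(is_acyclic({"a": {"b"}, "b": {"a"}}))
```

A scheduler can only run a workflow whose dependencies form a DAG; a cycle, as in the second call, leaves no valid starting task.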

Curriculum

Queues and Databases-RabbitMQ and MySQL

Meet your instructor: Alfredo Deza

About this course

Introduction

Overview of Queues

What is Celery?

Use cases for RabbitMQ

Overview of a Flask and Celery application

Summary

Introduction

Configuring Celery with Flask

Connecting Celery with RabbitMQ

Defining a Celery task in Flask

Fire and forget task in Flask

Retrieve values from asynchronous tasks

Summary

MySQL Overview

MySQL from Terminal

Archive and Drop Database

Import external database Sakila

Modify database Sakila

Bash pipelines with MySQL

MySQL to Python Standard Library Web Server

Connect with your instructor

Meet your instructor: Noah Gift

Course structure and discussion etiquette

Key Terms

Introduction to Celery

Using RabbitMQ with Docker

External lab: Start RabbitMQ in a development environment

Key Terms

Build a web app by using Python and Flask

Background tasks with Celery

External lab: Add a new Celery task for RabbitMQ

Key Terms

Getting Started with MySQL

Lesson Reflection

Queues and Databases - Final week quiz

Introduction to RabbitMQ and Flask

RabbitMQ with Celery and Flask

Quiz-MySQL for Data Engineering

Meet and greet (optional)

Linux Hacking with MySQL

Optimizing Workflow Management at Scale with Apache Airflow

Introduction

What is Apache Airflow?

Installing Apache Airflow from PyPI

Using Apache Airflow with Docker

Exploring the Airflow UI

Introduction

Exploring directed acyclic graphs (DAG)

Creating a DAG

Running a backfill

Testing and validation

Summary

Introduction

Identifying a task to build a DAG

Retrieving remote data

Cleaning and normalizing data

Inspecting the UI for results

Summary

Key Terms

What is Apache Airflow

Exploring the Airflow User Interface

External lab: Install Apache Airflow

Lesson Reflection

Key Terms

External lab: Create a DAG

Architecture overview

Lesson Reflection

Key Terms

External Lab: Build a data pipeline for census data

Build Data Pipelines with Apache Airflow

Lesson Reflection

Final Week Quiz-Optimizing Workflow Management at Scale with Apache Airflow

Quiz-Installing Apache Airflow

Quiz-Apache Airflow Fundamentals

Quiz-Creating a pipeline

Achieving Scalability with Vector, Graph, and Key/Value Databases

Picking the proper database

What are vector databases and how they work

Implementing Semantic search

Quickstart Qdrant

Qdrant Rust Client

Vector Database Architectures

Hands-on lab: Enhance Semantic Search

Graph data models and database concepts

Introduction to Amazon Neptune

Graph algorithms: UFC graph centrality in Rust

Kosaraju Community Detection in Graphs

Shortest Path with Graphs

Key Components of Rust CLI Tool

Lab Walkthrough: Building a Rust Graph CLI Tool

Key Terms

What is a Vector Database?

External Lab: Run Quickstart of qdrant

External Lab: Extend Semantic Search

Jaccard index

Lesson Reflection

Key Terms

Rust CLI with Clap

External Lab: Rust Graph CLI Tool

Amazon Neptune

Lesson Reflection

Final Quiz-Achieving Scalability with Vector, Graph, and Key/Value Databases

Quiz-Introduction to Vector Databases

Quiz-Introduction to Graph Databases

Social Media Recommender

Real-world Advanced Data Engineering Projects

Learn AWS CloudShell for Dynamo Development

Learn AWS CodeCatalyst for Dynamo Development

Leveraging AWS CodeWhisperer for Dynamo Development

Create a Table with CLI

Populate a Table With Batching Records

Query a Table with Records

Project Walkthrough

Introduction

Overview of a pipeline requirements

Using SqlAlchemy with Pandas

Persisting data in a task

Reviewing the results

Summary

Key Terms

Amazon CodeCatalyst

Lesson Reflection

External Lab: Extended DynamoDB

Key Terms

Quick start for SQLAlchemy

Explore and analyze data with Python

Lesson Reflection

Recommended Next Steps

Final Quiz-Advanced Data Engineering

Quiz-Building a solution with DynamoDB with the AWS CLI

Quiz-Persisting data through a multi-task DAG with Pandas

Jupyter Sandbox

VS Code Sandbox

Admission Process

Important Dates

Course Commencement Date: May 25, 2024

Other courses offered by Coursera

Databases and SQL for Data Science with Python

IBM - Institute of Business Management (Certificate)

Total Fees

– / –

Duration

3 months

Difficulty level

Beginner

Databases and SQL for Data Science with Python

IBM - Institute of Business Management (Certificate)

Total Fees

– / –

Duration

20 hours

Difficulty level

Beginner

Skills

Python, RDBMS

Learn SQL Basics for Data Science Specialization

University of California, Davis (Certificate)

Total Fees

– / –

Duration

2 months

Difficulty level

Beginner

Skills

Data analysis, MySQL, Apache

Machine Learning for Marketing Specialization

Coursera (Certificate)

Total Fees

– / –

Duration

3 months

Difficulty level

Beginner

Skills

Data analysis


