Microsoft - Microsoft Azure Databricks for Data Engineering
- Offered by Coursera
Microsoft Azure Databricks for Data Engineering at Coursera Overview
Duration | 22 hours |
Start from | Start Now |
Total fee | Free |
Mode of learning | Online |
Difficulty level | Intermediate |
Official Website | Explore Free Course |
Credential | Certificate |
Microsoft Azure Databricks for Data Engineering at Coursera Highlights
- Flexible deadlines: reset deadlines in accordance with your schedule
- Earn a certificate upon completion from Coursera
Microsoft Azure Databricks for Data Engineering at Coursera Course details
- In this course, learners will harness Apache Spark and powerful clusters running on the Azure Databricks platform to run large data engineering workloads in the cloud
- Discover the capabilities of Azure Databricks and the Apache Spark notebook for processing huge files
- Learners will come to understand the Azure Databricks platform and identify the types of tasks well-suited for Apache Spark
- Students will also be introduced to the architecture of an Azure Databricks Spark Cluster and Spark Jobs
- They will work with large amounts of data from multiple sources in different raw formats and learn how Azure Databricks supports day-to-day data-handling functions, such as reads, writes, and queries
- This course is part of a specialization intended for data engineers and developers who want to demonstrate their expertise in designing and implementing data solutions that use Microsoft Azure data services, and for anyone interested in preparing for Exam DP-203: Data Engineering on Microsoft Azure (beta)
Microsoft Azure Databricks for Data Engineering at Coursera Curriculum
Introduction to Azure Databricks
Introduction to the course
Explain Azure Databricks
Lesson summary
Lesson introduction
Understand the architecture of Azure Databricks Spark cluster
Understand the architecture of a Spark job
Lesson summary
Course syllabus
How to be successful in this course
Create an Azure Databricks workspace and cluster
Create and execute a notebook
Exercise: Work with Notebooks
Exercise quiz
Knowledge check
Knowledge check
Test prep
Read and write data in Azure Databricks
Lesson introduction
Lesson summary
Read data in CSV format
Read data in JSON format
Read data in Parquet format
Read data stored in tables and views
Write data
Exercises: Read and write data
Exercise quiz
Knowledge check
Test prep
Data processing in Azure Databricks
Lesson introduction
Lesson summary
Lesson introduction
Describe the fundamentals of how the Catalyst Optimizer works
Describe performance enhancements enabled by shuffle operations and Tungsten
Lesson summary
Describe a DataFrame
Use common DataFrame methods
Use the display function
Exercise: Distinct articles
Describe the difference between eager and lazy execution
Define and identify actions and transformations
Exercise quiz
Knowledge check
Knowledge check
Test prep
Work with DataFrames in Azure Databricks
Lesson introduction
Lesson summary
Lesson introduction
Lesson summary
Describe the column class
Work with column expressions
Exercise: Washingtons and Marthas
Perform date and time manipulation
Use aggregate functions
Exercise: Deduplication of data
Exercise quiz
Knowledge check
Exercise quiz
Knowledge check
Test prep
Platform architecture, security, and data protection in Azure Databricks
Lesson introduction
Describe the Azure Databricks platform architecture
Perform data protection
Secure access with Azure IAM and authentication
Describe security
Lesson summary
Create the required resources
Describe Azure Key Vault and Databricks security scopes
Exercise: Access Azure Storage with Key Vault-backed secrets
Further resources
Exercise quiz
Knowledge check
Test prep
Delta Lake
Describe the open source Delta Lake
Lesson summary
Lesson introduction
Describe bronze, silver, and gold architecture
Lesson summary
Get started with Delta using Spark APIs
Exercise: Work with basic Delta Lake functionality
Describe how Azure Databricks manages Delta Lake
Exercise: Use the Delta Lake Time Machine and perform optimization
Perform batch and stream processing
Further resources
Exercise quiz
Exercise quiz
Knowledge check
Knowledge check
Test prep
Analyze streaming data and create production workloads
Lesson introduction
Describe Azure Databricks structured streaming
Lesson summary
Lesson introduction
Create the required resources
Summary
Perform stream processing using structured streaming
Work with Time Windows
Process data from Event Hubs with structured streaming
Schedule Databricks jobs in a Data Factory pipeline
Pass parameters into and out of Databricks jobs in Data Factory
Further resources
Knowledge check
Knowledge check
Test prep
Create a data architecture
Lesson introduction
Describe CI/CD
Lesson summary
Lesson summary
Lesson summary
Lesson introduction
Understand workspace administration best practices
List security best practices
Describe tools and integration best practices
Explain Databricks runtime best practices
Lesson summary
Create a CI/CD process with Azure DevOps
Set up Azure Synapse Analytics
Integrate with Azure Synapse Analytics
Understand cluster best practices
Further resources
Knowledge check
Knowledge check
Knowledge check
Test prep
Practice Exam on Data engineering with Azure Databricks
Course recap
Course summary
About the practice exam
Next steps
Course practice exam