Coursera

Databricks - Apache Spark (TM) SQL for Data Analysts 

  • Offered by Coursera

Apache Spark (TM) SQL for Data Analysts at Coursera
Overview

Duration: 14 hours
Start from: Start Now
Total fee: Free
Mode of learning: Online
Difficulty level: Intermediate

Credential: Certificate

Highlights

  • Earn a shareable certificate upon completion.
  • Flexible deadlines according to your schedule.

Course details

More about this course
  • Apache Spark is one of the most widely used technologies in big data analytics. In this course, you will learn how to leverage your existing SQL skills to start working with Spark immediately. You will also learn how to work with Delta Lake, a highly performant, open-source storage layer that brings reliability to data lakes. By the end of this course, you will be able to use Spark SQL and Delta Lake to ingest, transform, and query data to extract valuable insights that can be shared with your team.

Curriculum

Welcome to Apache Spark SQL for Data Analysts
  • Course goals
  • Before you begin
  • End of module knowledge check

Spark makes big data easy
  • Introduction to module 2
  • What is big data?
  • Common struggles with big data
  • Big Data Needs
  • Apache Spark Intro
  • Spark SQL
  • Module 2 Concept Review

Using Spark SQL on Databricks
  • Introduction to Module 3
  • Signing up for Databricks Community Edition
  • Preparing your workspace
  • Working with notebooks
  • Using course materials
  • Basic queries with Spark SQL reading introduction
  • Data Visualization on Databricks reading introduction
  • Data visualization tools
  • Exploratory Data Analysis lab introduction
  • Course Materials
  • Basic Queries reading activity
  • Data Visualization reading activity
  • Your turn! Exploratory Data Analysis lab
  • Module 3 Concept Review
  • 3.3 Exploratory Data Analysis Quiz

Spark Under the Hood
  • Introduction to module 4
  • Understanding optimizations
  • The physical cluster
  • The SparkUI and SQL tab
  • Optimizing query logic
  • Impact of Caching
  • Optimizing with selective data loading
  • Module 4 Concept Review

Complex Queries
  • Introduction to module 5
  • What is nested data?
  • Introduction to managing nested data
  • Introduction to Manipulating Data
  • Introduction to Data Munging
  • Managing Nested Data reading activity
  • Manipulating Data reading activity
  • 5.3 Data Munging Lab
  • Module 5 Concept Review
  • Lab 5.3 Quiz

Applied Spark SQL
  • Introduction to module 6
  • Complex data - common strategies
  • About higher-order functions
  • Higher-order functions introduction
  • Introducing Aggregating and Summarizing Data
  • Partitioning Tables Introduction
  • Sharing Insights Lab Introduction
  • Higher Order Functions reading activity
  • Aggregating and Summarizing Data reading activity
  • Partitioning Tables
  • Sharing Insights
  • Module 6 concept review
  • Lab 6.4 Quiz

Data Storage and Optimization
  • Introduction to module 7
  • A quick refresher
  • Introducing a new data management paradigm
  • Introduction to the lesson
  • What is Delta Lake
  • Data Warehouses
  • Data Lakes
  • Data Lakes vs Data Warehouses
  • The Lakehouse

Delta Lake with Spark SQL
  • Introduction to the module
  • Intro to Using Delta reading
  • Managing Records in a Delta table
  • Delta Engine Optimization Introduction
  • Delta Lake Lab Introduction
  • 8.1 Using Delta
  • 8.2 Managing records
  • 8.3 Optimizing Delta
  • Delta Lab
  • 8.4 Delta Lab

SQL Coding Challenges
  • SQL coding challenges
  • Final Exam

Admission Process

    Important Dates

    Course Commencement Date: May 25, 2024

    Students Ratings & Reviews

    5/5 (1 Rating)
    RUSHIKESH J
    Rating: 5
    Learning Experience: Data analysis, Databricks, transformation, query optimization
    Faculty: Databricks team, awesome. Yes, content is good.
    Course Support: No career support provided
    Reviewed on 22 Mar 2022
