AWS - Introduction to Designing Data Lakes in AWS
- Offered by Coursera
Introduction to Designing Data Lakes in AWS at Coursera Overview
Duration | 13 hours |
Start from | Start Now |
Total fee | Free |
Mode of learning | Online |
Difficulty level | Intermediate |
Credential | Certificate |
Introduction to Designing Data Lakes in AWS at Coursera Highlights
- Earn a shareable certificate upon completion.
- Flexible deadlines according to your schedule.
Introduction to Designing Data Lakes in AWS at Coursera Course details
- In this class, Introduction to Designing Data Lakes in AWS, we will help you understand how to create and operate a data lake in a secure and scalable way, with no previous knowledge of data science required. Starting with why you may want a data lake, we will look at the data lake value proposition, characteristics, and components.
- Designing a data lake is challenging because of the scale and growth of data. Developers need to understand best practices to avoid common mistakes that can be hard to rectify later. In this course we will cover the foundations of what a data lake is and how to ingest and organize data into it, then dive into the data processing that can be done to optimize performance and costs when consuming the data at scale. This course is for professionals (architects, system administrators, and DevOps engineers) who need to design and build an architecture for secure and scalable data lake components. Students will learn about the use cases for a data lake and contrast them with a traditional infrastructure of servers and storage.
Introduction to Designing Data Lakes in AWS at Coursera Curriculum
Week 1
Introduction to Designing Data Lakes in AWS
Meet the Instructors
Introduction to Week 1
Why Data Lakes
Characteristics of Data Lakes
Data Lakes Components
Comparison of a Data Lake to a Data Warehouse
Discussing Sample Data Lake Architectures
Course Welcome and Student Information
Data Lake Characteristics and Components
Data Lakes and Data Warehouses
Week 1 Quiz
Week 2
Introduction to Week 2
AWS Data Lake Related Services
Amazon S3
AWS Glue Data Catalog
AWS Services Used for Data Movement
AWS Services for Data Processing
AWS Services for Analytics
AWS Services for Predictive Analytics and Machine Learning
Introduction to AWS Lake Formation
Amazon S3 and Glue Data Catalog
Data Movement
EMR, Glue Jobs, Lambda, Kinesis Analytics, Redshift
AWS Lake Formation
Week 2 Quiz
Week 3
Introduction to Week 3
Use the Right Tool for the Job
Understanding Data Structure and When To Process Data
Data Streaming Ingestion With Kinesis Services
Batch Data Ingestion with AWS Transfer Family
Batch Data Ingestion with AWS Snow Family
Data Cataloging
Using Glue Crawlers
Reviewing the Ingestion Part in Data Lake Architectures
Diving Deep on Amazon Kinesis
Batch Data Ingestion with AWS Services
The Importance of Data Cataloging
Week 3 Quiz
Week 4
Introduction to Week 4
Data Prep and AWS Glue Jobs
File Optimizations
Using S3, Glue and Athena to Get Insights about NYC Taxi Data
Introduction to Data Lake Security
The Power of Data Visualization
Introduction to Amazon QuickSight
Amazon QuickSight Demo
Registry of Open Data on AWS
Course Wrap Up
Columnar Data Formats and Amazon Athena Optimizations
Security and Compliance
Data Visualization, Amazon QuickSight
Registry of Open Data
Week 4 Quiz
Final Assessment