Big Data and Hadoop Spark Developer Training

4.5/5

(2 Ratings)

Offered by Simplilearn
Private Institute
Estd. 2010

Overview

Duration	35 hours
Mode of learning	Online
Difficulty level	Intermediate
Credential	Certificate
Future job roles	CRUD, .Net, CSR, Credit risk, Senior Software Developer

Highlights

Audio-video Lectures along with Chapter-level Quizzes
Aligned to Cloudera CCA175 certification exam
A great course for learning Big Data
Certification Course

Course details

More about this course

This Big Data Hadoop Certification course is designed to give you in-depth knowledge of the big data framework using Hadoop and Spark.
In this hands-on big data course, students execute real-life, industry-based projects using Simplilearn's integrated labs.

Curriculum

Lesson 1 Course Introduction

Course Introduction

Accessing Practice Lab

Lesson 2 Introduction to Big Data and Hadoop

Introduction to Big Data and Hadoop

Introduction to Big Data

Big Data Analytics

What is Big Data

Four Vs Of Big Data

Case Study: Royal Bank of Scotland

Challenges of Traditional System

Distributed Systems

Introduction to Hadoop

Components of Hadoop Ecosystem: Part One

Components of Hadoop Ecosystem: Part Two

Components of Hadoop Ecosystem: Part Three

Commercial Hadoop Distributions

Demo: Walkthrough of Simplilearn Cloudlab

Key Takeaways

Knowledge Check

Lesson 3 Hadoop Architecture, Distributed Storage (HDFS) and YARN

Hadoop Architecture, Distributed Storage (HDFS) and YARN

What Is HDFS

Need for HDFS

Regular File System vs HDFS

Characteristics of HDFS

HDFS Architecture and Components

High Availability Cluster Implementations

HDFS Component File System Namespace

Data Block Split

Data Replication Topology

HDFS Command Line

Demo: Common HDFS Commands

HDFS Command Line

YARN Introduction

YARN Use Case

YARN and Its Architecture

Resource Manager

How Resource Manager Operates

Application Master

How YARN Runs an Application

Tools for YARN Developers

Demo: Walkthrough of Cluster Part One

Demo: Walkthrough of Cluster Part Two

Key Takeaways

Knowledge Check

Hadoop Architecture, Distributed Storage (HDFS) and YARN

Lesson 4 Data Ingestion into Big Data Systems and ETL

Data Ingestion into Big Data Systems and ETL

Data Ingestion Overview Part One

Data Ingestion

Apache Sqoop

Sqoop and Its Uses

Sqoop Processing

Sqoop Import Process

Assisted Practice: Import into Sqoop

Sqoop Connectors

Demo: Importing and Exporting Data from MySQL to HDFS

Apache Sqoop

Apache Flume

Flume Model

Scalability in Flume

Components in Flume's Architecture

Configuring Flume Components

Demo: Ingest Twitter Data

Apache Kafka

Aggregating User Activity Using Kafka

Kafka Data Model

Partitions

Apache Kafka Architecture

Producer Side API Example

Consumer Side API

Demo: Setup Kafka Cluster

Consumer Side API Example

Kafka Connect

Key Takeaways

Demo: Creating Sample Kafka Data Pipeline using Producer and Consumer

Knowledge Check

Data Ingestion into Big Data Systems and ETL

Lesson 5 Distributed Processing - MapReduce Framework and Pig

Distributed Processing - MapReduce Framework and Pig

Distributed Processing in MapReduce

Word Count Example

Map Execution Phases

Map Execution Distributed Two Node Environment

MapReduce Jobs

Hadoop MapReduce Job Work Interaction

Setting Up the Environment for MapReduce Development

Set of Classes

Creating a New Project

Advanced MapReduce

Data Types in Hadoop

OutputFormats in MapReduce

Using Distributed Cache

Joins in MapReduce

Replicated Join

Introduction to Pig

Components of Pig

Pig Data Model

Pig Interactive Modes

Pig Operations

Various Relations Performed by Developers

Demo: Analyzing Web Log Data Using MapReduce

Demo: Analyzing Sales Data and Solving KPIs using PIG

Apache Pig

Demo: Wordcount

Key Takeaways

Knowledge Check

Distributed Processing - MapReduce Framework and Pig

Lesson 6 Apache Hive

Apache Hive

Hive SQL over Hadoop MapReduce

Hive Architecture

Interfaces to Run Hive Queries

Running Beeline from Command Line

Hive Metastore

Hive DDL and DML

Creating New Table

Data Types

Validation of Data

File Format Types

Data Serialization

Hive Table and Avro Schema

Hive Optimization Partitioning Bucketing and Sampling

Non Partitioned Table

Data Insertion

Dynamic Partitioning in Hive

Bucketing

What Do Buckets Do

Hive Analytics UDF and UDAF

Assisted Practice: Synchronization

Other Functions of Hive

Demo: Real-Time Analysis and Data Filtration

Demo: Real-World Problem

Demo: Data Representation and Import using Hive

Key Takeaways

Knowledge Check

Apache Hive


Student Ratings & Reviews

4.5/5 (2 Ratings)

Rating range	Number of ratings
4-5	1
3-4	1

Amrita Rohatgi


Learning Experience: It was an awesome learning platform, and I received an appreciation certificate for Big Data Engineer.

Faculty: It was good. They had multiple tests and assignments, and lastly an exam.

Reviewed on 1 Mar 2023

Shabarinath gupta kokkonda


Learning Experience: It gave very high-level information about big data, Hadoop, HDFS, Spark, Scala, stream processing, MapReduce, and many other architectures.

Faculty: The instructors taught well. It was clear and not confusing, with every topic explained clearly.

Course Support: Yes, but only in the paid version.

Reviewed on 9 Jul 2022
