Simplilearn

Big Data Architect Master's Program 

  • Offered by Simplilearn
  • Private Institute
  • Estd. 2010

Big Data Architect Master's Program at Simplilearn
Overview

Mode of learning

Online

Difficulty level

Intermediate

Credential

Certificate

Highlights

  • Certificate of Completion
  • IIMJobs Pro-Membership for 6 Months
  • 50+ In-Demand Skills & Tools
  • 12+ Real-Life Projects

Course details

Who should do this course?
  • Software Developers
  • Testers
  • Software Architects
  • Analytics Professionals
  • Data Management Professionals
  • Data warehouse Professionals
  • Project Managers
  • Mainframe Professionals
  • Graduates looking to build a career in Big Data Hadoop
What are the course deliverables?
  • Simplilearn’s Big Data Hadoop Architect Masters Program will help you master skills and tools such as Cassandra Architecture, Data Model Creation, Database Interfaces, Advanced Architecture, Spark, Scala, RDD, SparkSQL, Spark Streaming, Spark ML, GraphX, Replication, Sharding, Scalability, Hadoop clusters, Storm Architecture, Ingestion, Zookeeper, and Kafka Architecture.
  • The program provides access to high-quality eLearning content, simulation exams, a community moderated by experts, and other resources that help you follow the optimal path to your dream role of data scientist.
  • Tools Covered - Advanced Analytics Tools, Data Collection and Storage Tools, ETL Tools, File System, Programming Tools
More about this course
  • These skills will help you prepare for the role of a Big Data Hadoop architect.

Curriculum

Big Data Hadoop Developer - Online Flexipass

Course Introduction

Why Big Data

What is Big Data

What is Big Data (contd)

Facts about Big Data

Evolution of Big Data

Case Study Netflix and the House of Cards

Market Trends

Course Objectives

Course Details

Project Submission and Certification

On Demand Support

Key Features

Conclusion

Introduction to Big Data and Hadoop

Objectives

Data Explosion

Types of Data

Need for Big Data

Big Data and Its Sources

Characteristics of Big Data

Characteristics of Big Data Technology

Knowledge Check

Leveraging Multiple Data Sources

Traditional IT Analytics Approach

Traditional IT Analytics Approach (contd)

Big Data Technology Platform for Discovery and Exploration

Big Data Technology Platform for Discovery and Exploration (contd)

Big Data Technology Capabilities

Big Data Use Cases

Handling Limitations of Big Data

Introduction to Hadoop

History and Milestones of Hadoop

Organizations Using Hadoop

VMware Player Introduction

VMware Player Hardware Requirements

Oracle VirtualBox to Open a VM

Installing VM using Oracle VirtualBox Demo

Opening a VM using Oracle VirtualBox Demo

Quiz

Summary

Conclusion

Hadoop Architecture

Hadoop Architecture

Objectives

Key Terms

Hadoop Cluster Using Commodity Hardware

Hadoop Configuration

Hadoop Core Services

Apache Hadoop Core Components

Why HDFS

What is HDFS

HDFS Real Life Connect

Regular File System vs HDFS

HDFS Characteristics

HDFS Key Features

HDFS Architecture

NameNode in HA mode

NameNode HA Architecture

HDFS Operation Principle

File System Namespace

NameNode Operation

Data Block Split

Benefits of Data Block Approach

HDFS Block Replication Architecture

Replication Method

Data Replication Topology

Data Replication Representation

HDFS Access

Business Scenario

Create a new Directory in HDFS Demo

Spot the Error

Quiz

Case Study

Case Study Demo

Summary

Conclusion

Hadoop Deployment

Hadoop Deployment

Objectives

Ubuntu Server Introduction

Installation of Ubuntu Server

Business Scenario

Installing Ubuntu Server Demo

Hadoop Installation Prerequisites

Hadoop Installation

Installing Hadoop Demo

Hadoop Multi Node Installation Prerequisites

Steps for Hadoop Multi Node Installation

Single Node Cluster vs Multi Node Cluster

Creating a Clone of Hadoop VM Demo

Performing Clustering of the Hadoop Environment Demo

Spot the Error

Quiz

Case Study

Case Study Demo

Summary

Conclusion

Introduction to MapReduce

Introduction to YARN and MapReduce

Objectives

Why YARN

What is YARN

YARN Real Life Connect

YARN Infrastructure

YARN Infrastructure (contd)

Resource Manager

Other Resource Manager Components

Resource Manager in HA Mode

Application Master

Node Manager

Container

Applications Running on YARN

Application Startup in YARN

Application Startup in YARN (contd)

Role of AppMaster in Application Startup

Why MapReduce

What is MapReduce

MapReduce Real Life Connect

MapReduce Analogy

MapReduce Analogy (contd)

MapReduce Example

Map Execution

Map Execution Distributed Two Node Environment

MapReduce Essentials

MapReduce Jobs

MapReduce and Associated Tasks

Hadoop Job Work Interaction

Characteristics of MapReduce

Real time Uses of MapReduce

Prerequisites for Hadoop Installation in Ubuntu Desktop

Steps to Install Hadoop

Business Scenario

Set up Environment for MapReduce Development

Small Data and Big Data

Uploading Small Data and Big Data

Installing Ubuntu Desktop OS Demo

Build MapReduce Program

Build a MapReduce Program Demo

Hadoop MapReduce Requirements

Steps of Hadoop MapReduce

MapReduce Responsibilities

MapReduce Java Programming in Eclipse

Create a New Project

Checking Hadoop Environment for MapReduce

Build a MapReduce Application using Eclipse and Run in Hadoop Cl Demo

MapReduce v

Spot the Error

Quiz

Case Study

Case Study Demo

Summary

Conclusion

Advanced HDFS and MapReduce

Advanced HDFS and MapReduce

Objectives

Advanced HDFS Introduction

HDFS Benchmarking

Setting Up HDFS Block Size

Decommissioning a DataNode

Business Scenario

HDFS Demo

Setting HDFS block size in Hadoop Demo

Advanced MapReduce

Interfaces

Data Types in Hadoop

Data Types in Hadoop (contd)

InputFormats in MapReduce

OutputFormats in MapReduce

Distributed Cache

Using Distributed Cache Step

Using Distributed Cache Step

Using Distributed Cache Step

Joins in MapReduce

Reduce Side Join

Reduce Side Join (contd)

Replicated Join

Replicated Join (contd)

Composite Join

Composite Join (contd)

Cartesian Product

Cartesian Product (contd)

MapReduce program for Writable classes Demo

Spot the Error

Quiz

Case Study

Case Study Demo

Summary

Conclusion

Business Analytics Foundation With R Tools

Business Analytics Foundation With R Tools

Business Analytics Foundation With R Tools

Objectives

Analytics

Places Where Analytics is Applied

Topics Covered

Topics Covered (contd)

Career Path

Thank You

Introduction to Analytics

Introduction to Analytics

Analytics vs Analysis

What is Analytics

Popular Tools

Role of a Data Scientist

Data Analytics Methodology

Problem Definition

Summarizing Data

Data collection

Data Dictionary

Outlier Treatment

Quiz

Statistical Concepts And Their Application In Business

Statistical Concepts And Their Application In Business

Descriptive Statistics

Probability Theory

Tests of Significance

Non-parametric Testing

Quiz

Basic Analytic Techniques Using R

Introduction

Data Exploration

Data Visualization

Pie Charts

Correlation

Analysis of Variance

Chi-squared Test

T-test

Summary

Quiz

Predictive Modelling Techniques

Predictive Modelling Techniques

Regression Analysis and Types of regression models

Linear Regression

Coefficient of Determination (R²)

How good is the model

How to Find the Linear Regression Equation

Commands to perform linear regression

Linear regression to predict sales

Case Study Linear Regression

Case Study Classification

Logistic regression

Example Logistic regression in R

Logistic Regression Predicting recurrent visits to a web site

Cluster Analysis

Command to perform clustering in R

Hierarchical Clustering

Case Study Implement K means and Hierarchical Clustering

Time Series

Cyclical versus seasonal analysis

Decomposition of Time Series

Case Study Time Series Analysis

Decomposing Non Seasonal Time Series

Exponential Smoothing

Advantages and Disadvantages of Exponential Smoothing

Exponential smoothing and forecasting in R

Example Holt Winters

White Noise

Correlogram Analysis

Box Jenkins forecasting Models

Case Study Time Series Data using ARMA

Business Case

Summary

Thank You

Java Essentials for Hadoop

Essentials of Java for Hadoop

Essentials of Java for Hadoop

Objectives

Java Definition

Java Virtual Machine (JVM)

Working of Java

Running a Basic Java Program

Running a Basic Java Program (contd)

Running a Basic Java Program in NetBeans IDE

BASIC JAVA SYNTAX

Data Types in Java

Variables in Java

Naming Conventions of Variables

Type Casting

Operators

Mathematical Operators

Unary Operators

Relational Operators

Logical or Conditional Operators

Bitwise Operators

Static Versus Non Static Variables

Static Versus Non Static Variables (contd)

Statements and Blocks of Code

Flow Control

If Statement

Variants of if Statement

Nested If Statement

Switch Statement

Switch Statement (contd)

Loop Statements

Loop Statements (contd)

Break and Continue Statements

Basic Java Constructs

Arrays

Arrays (contd)

JAVA CLASSES AND METHODS

Classes

Objects

Methods

Access Modifiers

Summary

Thank You

Java Constructors

Java Constructors

Objectives

Features of Java

Classes Objects and Constructors

Constructors

Constructor Overloading

Constructor Overloading (contd)

PACKAGES

Definition of Packages

Advantages of Packages

Naming Conventions of Packages

INHERITANCE

Definition of Inheritance

Multilevel Inheritance

Hierarchical Inheritance

Method Overriding

Method Overriding(contd)

Method Overriding(contd)

ABSTRACT CLASSES

Definition of Abstract Classes

Usage of Abstract Classes

INTERFACES

Features of Interfaces

Syntax for Creating Interfaces

Implementing an Interface

Implementing an Interface(contd)

INPUT AND OUTPUT

Features of Input and Output

System.in.read() Method

Reading Input from the Console

Stream Objects

String Tokenizer Class

Scanner Class

Writing Output to the Console

Summary

Thank You

Essential Classes and Exceptions in Java

Essential Classes and Exceptions in Java

Objectives

The Enums in Java

Program Using Enum

Array List

ArrayList Constructors

Methods of Array List

ArrayList Insertion

ArrayList Insertion (contd)

Iterator

Iterator (contd)

ListIterator

ListIterator (contd)

Displaying Items Using List Iterator

For Each Loop

For Each Loop (contd)

Enumeration

Enumeration (contd)

HASHMAPS

Features of Hashmaps

Hashmap Constructors

Hashmap Methods

Hashmap Insertion

HASHTABLE CLASS

Hashtable Class and Constructors

Hashtable Methods

Hashtable Methods

Hashtable Insertion and Display

Hashtable Insertion and Display (contd)

EXCEPTIONS

Exception Handling

Exception Classes

User Defined Exceptions

Types of Exceptions

Exception Handling Mechanisms

Try Catch Block

Multiple Catch Blocks

Throw Statement

Throw Statement (contd)

User Defined Exceptions

Advantages of Using Exceptions

Error Handling and finally block

Summary

Thank You

Apache Spark and Scala - Online Flexipass

Course Overview

Introduction

Course Objectives

Course Overview

Target Audience

Course Prerequisites

Value to the Professionals

Value to the Professionals (contd)

Value to the Professionals (contd)

Lessons Covered

Conclusion

Introduction to Spark

Introduction

Objectives

Evolution of Distributed Systems

Need of New Generation Distributed Systems

Limitations of MapReduce in Hadoop

Limitations of MapReduce in Hadoop (contd)

Batch vs Real Time Processing

Application of Stream Processing

Application of In Memory Processing

Introduction to Apache Spark

Components of a Spark Project

History of Spark

Language Flexibility in Spark

Spark Execution Architecture

Automatic Parallelization of Complex Flows

Automatic Parallelization of Complex Flows Important Points

APIs That Match User Goals

Apache Spark A Unified Platform of Big Data Apps

More Benefits of Apache Spark

Running Spark in Different Modes

Installing Spark as a Standalone Cluster Configurations

Installing Spark as a Standalone Cluster Configurations

Demo Install Apache Spark

Demo Install Apache Spark

Overview of Spark on a Cluster

Tasks of Spark on a Cluster

Companies Using Spark Use Cases

Hadoop Ecosystem vs Apache Spark

Hadoop Ecosystem vs Apache Spark (contd)

Quiz

Summary

Summary (contd)

Conclusion

Introduction to Programming in Scala

Introduction

Objectives

Introduction to Scala

Features of Scala

Basic Data Types

Basic Literals

Basic Literals (contd)

Basic Literals (contd)

Introduction to Operators

Types of Operators

Use Basic Literals and the Arithmetic Operator

Demo Use Basic Literals and the Arithmetic Operator

Use the Logical Operator

Demo Use the Logical Operator

Introduction to Type Inference

Type Inference for Recursive Methods

Type Inference for Polymorphic Methods and Generic Classes

Unreliability on Type Inference Mechanism

Mutable Collection vs Immutable Collection

Functions

Anonymous Functions

Objects

Classes

Use Type Inference, Functions, Anonymous Function, and Class

Demo Use Type Inference, Functions, Anonymous Function and Class

Traits as Interfaces

Traits Example

Collections

Types of Collections

Types of Collections (contd)

Lists

Perform Operations on Lists

Demo Use Data Structures

Maps

Maps Operations

Pattern Matching

Implicits

Implicits (contd)

Streams

Use Data Structures

Demo Perform Operations on Lists

Quiz

Summary

Summary (contd)

Conclusion

Using RDD for Creating Applications in Spark

Introduction

Objectives

RDDs API

Features of RDDs

Creating RDDs

Creating RDDs: Referencing an External Dataset

Referencing an External Dataset: Text Files

Referencing an External Dataset: Text Files (contd)

Referencing an External Dataset: Sequence Files

Referencing an External Dataset: Other Hadoop Input Formats

Creating RDDs: Important Points

RDD Operations

RDD Operations Transformations

Features of RDD Persistence

Storage Levels Of RDD Persistence

Choosing The Correct RDD Persistence Storage Level

Invoking the Spark Shell

Importing Spark Classes

Creating the Spark Context

Loading a File in Shell

Performing Some Basic Operations on Files in Spark Shell RDDs

Packaging a Spark Project with SBT

Running a Spark Project With SBT

Demo Build a Scala Project

Build a Scala Project

Demo Build a Spark Java Project

Build a Spark Java Project

Shared Variables Broadcast

Shared Variables Accumulators

Writing a Scala Application

Demo Run a Scala Application

Run a Scala Application

Demo Write a Scala Application Reading the Hadoop Data

Write a Scala Application Reading the Hadoop Data

Demo Run a Scala Application Reading the Hadoop Data

Run a Scala Application Reading the Hadoop Data

Scala RDD Extensions

DoubleRDD Methods

PairRDD Methods: Join

PairRDD Methods: Others

Java PairRDD Methods

Java PairRDD Methods (contd)

General RDD Methods

General RDD Methods (contd)

Java RDD Methods

Java RDD Methods (contd)

Common Java RDD Methods

Spark Java Function Classes

Method for Combining JavaPairRDD Functions

Transformations in RDD

Other Methods

Actions in RDD

Key Value Pair RDD in Scala

Key Value Pair RDD in Java

Using MapReduce and Pair RDD Operations

Reading Text File from HDFS

Reading Sequence File from HDFS

Writing Text Data to HDFS

Writing Sequence File to HDFS

Using GroupBy

Using GroupBy (contd)

Demo Run a Scala Application Performing GroupBy Operation

Run a Scala Application Performing GroupBy Operation

Demo Run a Scala Application Using the Scala Shell

Run a Scala Application Using the Scala Shell

Demo Write and Run a Java Application

Write and Run a Java Application

Quiz

Summary

Summary (contd)

Conclusion

Running SQL Queries Using Spark SQL

Introduction

Objectives

Importance of Spark SQL

Benefits of Spark SQL

DataFrames

SQLContext

SQLContext (contd)

Creating a DataFrame

Using DataFrame Operations

Using DataFrame Operations (contd)

Demo Run SparkSQL with a Dataframe

Run SparkSQL with a Dataframe

Interoperating with RDDs

Using the Reflection Based Approach

Using the Reflection Based Approach (contd)

Using the Programmatic Approach

Using the Programmatic Approach (contd)

Demo Run Spark SQL Programmatically

Run Spark SQL Programmatically

Data Sources

Save Modes

Saving to Persistent Tables

Parquet Files

Partition Discovery

Schema Merging

JSON Data

Hive Table

DML Operation Hive Queries

Demo Run Hive Queries Using Spark SQL

Run Hive Queries Using Spark SQL

JDBC to Other Databases

Supported Hive Features

Supported Hive Features (contd)

Supported Hive Data Types

Case Classes

Case Classes (contd)

Quiz

Summary

Summary (contd)

Conclusion

Spark Streaming

Introduction

Objectives

Introduction to Spark Streaming

Working of Spark Streaming

Features of Spark Streaming

Streaming Word Count

Micro Batch

DStreams

DStreams (contd)

Input DStreams and Receivers

Input DStreams and Receivers (contd)

Basic Sources

Advanced Sources

Advanced Sources Twitter

Transformations on DStreams

Transformations on DStreams (contd)

Output Operations on DStreams

Design Patterns for Using ForeachRDD

DataFrame and SQL Operations

DataFrame and SQL Operations (contd)

Checkpointing

Enabling Checkpointing

Socket Stream

File Stream

Stateful Operations

Window Operations

Types of Window Operations

Types of Window Operations (contd)

Join Operations Stream Dataset Joins

Join Operations Stream Stream Joins

Monitoring Spark Streaming Application

Performance Tuning High Level

Performance Tuning Detail Level

Demo Capture and Process the Netcat Data

Capture and Process the Netcat Data

Demo Capture and Process the Flume Data

Capture and Process the Flume Data

Demo Capture the Twitter Data

Capture the Twitter Data

Quiz

Summary

Summary (contd)

Conclusion

MongoDB Developer and Administrator - Online Flexipass

Course Introduction

Course Introduction

Table of Contents

Objectives

Course Overview

Value to Professionals and Organizations

Course Prerequisites

Lessons Covered

Conclusion

Introduction to NoSQL databases

NoSQL Database Introduction

Objectives

What is NoSQL?

What is NoSQL?(contd)

Why NoSQL?

Difference Between RDBMS and NoSQL Databases

Benefits of NoSQL

Benefits of NoSQL (contd)

Types of NoSQL

Key Value Database

Key Value Database (contd)

Document Database

Document Database Example

Column Based Database

Column Based Database (contd)

Column Based Database (contd)

Column Based Database Example

Graph Database

Graph Database (contd)

CAP Theorem

CAP Theorem (contd)

Consistency

Availability

Partition Tolerance

MongoDB as per CAP

Quiz

Summary

Conclusion

MongoDB A Database for the Modern Web

MongoDB A Database for the Modern Web

Objectives

What is MongoDB?

JSON

JSON Structure

BSON

MongoDB Structure

Document Store Example

MongoDB as a Document Database

Transaction Management in MongoDB

Easy Scaling

Scaling Up vs Scaling Out

Vertical Scaling

Horizontal Scaling

Features of MongoDB

Secondary Indexes

Replication

Replication (contd)

Memory Management

Replica Set

Auto Sharding

Aggregation and MapReduce

Collection and Database

Schema Design and Modeling

Reference Data Model

Reference Data Model Example

Embedded Data Model

Embedded Data Model Example

Data Types

Data Types (contd)

Data Types (contd)

Core Servers of MongoDB

MongoDB's Tools

Installing MongoDB on Linux

Installing MongoDB on Linux

Installing MongoDB on Windows

Installing MongoDB on Windows

Starting MongoDB On Linux

Starting MongoDB On Linux

Starting MongoDB On Windows

Starting MongoDB On Windows

Use Cases

Use Cases (contd)

Quiz

Summary

Conclusion

CRUD Operations in MongoDB

CRUD Operations in MongoDB

Objectives

Data Modification in MongoDB

Batch Insert in MongoDB

Ordered Bulk Insert

Performing Ordered Bulk Insert

Performing Ordered Bulk Insert

Unordered Bulk Insert

Performing Unordered Bulk Insert

Performing Unordered Bulk Insert

Inserts Internals and Implications

Performing an Insert Operation

Performing an Insert Operation

Retrieving the documents

Specify Equality Condition

Retrieving Documents by Find Query

Retrieving Documents by Find Query

$in, $or, and AND Conditions

$or Operator

Specify AND/OR Conditions

Retrieving Documents by Using FindOne, AND/OR Conditions

Retrieving Documents by Using FindOne, AND/OR Conditions

Regular Expression

Array Exact Match

Array Projection Operators

Retrieving Documents for Array Fields

Retrieving Documents for Array Fields

$where Query

Cursor

Cursor (contd)

Cursor (contd)

Retrieving Documents Using Cursor

Retrieving Documents Using Cursor

Pagination

Pagination Avoiding Larger Skips

Advanced Query Options

Update Operation

Updating Documents in MongoDB

Updating Documents in MongoDB

$set

Updating Embedded Documents in MongoDB

Updating Embedded Documents in MongoDB

Updating Multiple Documents in MongoDB

Updating Multiple Documents in MongoDB

$unset and $inc Modifiers

$inc modifier to increment and decrement

$inc modifier to increment and decrement

Replacing Existing Document with New Document

Replacing Existing Document with New Document

$push and $addToSet

Positional Array Modification

Adding Elements into Array Fields

Adding Elements into Array Fields

Adding Elements to Array Fields Using AddToSet

Adding Elements to Array Fields Using AddToSet

Performing AddToSet

Performing AddToSet

Upsert

Removing Documents

Performing Upsert and Remove Operation

Performing Upsert and Remove Operation

Quiz

Summary

Conclusion

Indexing and Aggregation

Indexing and Aggregation

Objectives

Introduction to Indexing

Types of Index

Properties of Index

Single Field Index

Single Field Index on Embedded Document

Compound Indexes

Index Prefixes

Sort Order

Ensure Indexes Fit RAM

Multi Key Indexes

Compound Multi Key Indexes

Hashed Indexes

TTL Indexes

Unique Indexes

Sparse Indexes

Demo Create Compound, Sparse, and Unique Indexes

Demo Create Compound, Sparse, and Unique Indexes

Text Indexes

Demo Create Single Field and Text Index

Demo Create Single Field and Text Index

Text Search

Index Creation

Index Creation (contd)

Index Creation on Replica Set

Remove Indexes

Modify Indexes

Demo Drop an Index from a Collection

Demo Drop an Index from a Collection

Rebuild Indexes

Listing Indexes

Demo Retrieve Indexes for a Collection and Database

Demo Retrieve Indexes for a Collection and Database

Measure Index Use

Demo Use Mongo Shell Methods to Monitor Indexes

Demo Use Mongo Shell Methods to Monitor Indexes

Control Index Use

Demo Use the Explain, $hint and $natural Operators to Create Index

Demo Use the Explain, $hint and $natural Operators to Create Index

Index Use Reporting

Geospatial Index

Demo Create Geospatial Index

Demo Create Geospatial Index

MongoDB's Geospatial Query Operators

Demo Use Geospatial Index in a Query

Demo Use Geospatial Index in a Query

$geoWithin Operator

Proximity Queries in MongoDB

Aggregation

Aggregation (contd)

Pipeline Operators and Indexes

Aggregate Pipeline Stages

Aggregate Pipeline Stages (contd)

Aggregation Example

Demo Use Aggregate Function

Demo Use Aggregate Function

MapReduce

MapReduce (contd)

MapReduce (contd)

Demo Use MapReduce in MongoDB

Demo Use MapReduce in MongoDB

Aggregation Operations

Demo Use Distinct and Count Methods

Demo Use Distinct and Count Methods

Aggregation Operations (contd)

Demo Use the Group Function

Demo Use the Group Function

Quiz

Summary

Conclusion

Replication and Sharding

Replication and Sharding

Objectives

Introduction to Replication

Master Slave Replication

Replica Set in MongoDB

Replica Set in MongoDB (contd)

Automatic Failover

Replica Set Members

Priority Replica Set Members

Hidden Replica Set Members

Delayed Replica Set Members

Delayed Replica Set Members (contd)

Demo Start a Replica Set

Demo Start a Replica Set

Write Concern

Write Concern (contd)

Write Concern Levels

Write Concern for a Replica Set

Modify Default Write Concern

Read Preference

Read Preference Modes

Blocking for Replication

Tag Set

Configure Tag Sets for Replica set

Replica Set Deployment Strategies

Replica Set Deployment Strategies (contd)

Replica Set Deployment Patterns

Oplog File

Replication State and Local Database

Replication Administration

Demo Check a Replica Set Status

Demo Check a Replica Set Status

Sharding

When to Use Sharding?

What is a Shard?

What is a Shard Key

Choosing a Shard Key

Ideal Shard Key

Range Based Shard Key

Hash Based Sharding

Impact of Shard Keys on Cluster Operation

Production Cluster Architecture

Config Server Availability

Production Cluster Deployment

Deploy a Sharded Cluster

Add Shards to a Cluster

Demo Create a Sharded Cluster

Demo Create a Sharded Cluster

Enable Sharding for Database

Enable Sharding for Collection

Enable Sharding for Collection (contd)

Maintaining a Balanced Data Distribution

Splitting

Chunk Size

Special Chunk Type

Shard Balancing

Shard Balancing (contd)

Customized Data Distribution with Tag Aware Sharding

Tag Aware Sharding

Add Shard Tags

Remove Shard Tags

Quiz

Summary

Conclusion

Apache Cassandra

Course Overview

Course Overview

Course Objectives

Course Overview

Target Audience

Course Prerequisites

Value to the Professionals

Lessons Covered

Conclusion

Apache Cassandra: Overview of Big Data and NoSQL Database

Overview of Big Data and NoSQL Database

Course Map

Objectives

The Vs of Big Data

Volume

Data Sizes Terms

Velocity

Variety

Data Evolution

Features of Big Data

Big Data Use Cases

Big Data Analytics

Traditional Technology vs Big Data Technology

Apache Hadoop

HDFS

MapReduce

NoSQL Databases

Brewer's CAP Principle

Approaches to NoSQL Databases Types

Quiz

Summary

Conclusion

Introduction to Cassandra

Introduction to Cassandra

Course Map

Objectives

Introducing Cassandra

Behind the Name

History of Cassandra

Main Features of Cassandra

When is Cassandra Used

Simple Cassandra Program

Cassandra Command Line Interface

Advantages of Cassandra

Limitations of Cassandra

VMware

Simplilearn Virtual Machine

PuTTY

WinSCP

Demo Install and Setup VM

Demo Install and Setup VM

Quiz

Summary

Conclusion

Cassandra Architecture

Cassandra Architecture

Course Map

Objectives

Architecture Requirements of Cassandra

Cassandra Architecture

Cassandra Architecture (contd)

Effects of the Architecture

Cassandra Write Process

Rack

Cassandra Read Process

Example of Cassandra Read Process

Data Partitions

Replication in Cassandra

Network Topology

Snitches

Gossip Protocol

Seed Nodes

Configuration

Virtual Nodes

Token Generator

Example of Token Generator

Failure Scenarios Node Failure

Failure Scenarios Disk Failure

Failure Scenarios Rack Failure

Failure Scenarios Data Center Failure

Quiz

Summary

Conclusion

Cassandra Installation and Configuration

Cassandra Installation and Configuration

Course Map

Objectives

Cassandra Versions

Steps to Install and Configure Cassandra on Ubuntu System

Step Operating System Selection

Step Machine Selection

Step Preparing for Installation

Step Setup Repository

Step Install Cassandra

Step Check the Installation

Step Configuring Cassandra

Step Configuration for a Single Node Cluster

Step Configuration for a Multi Node and Multi Datacenter Clusters

Step Setup Property File

Step Configuration for a Production Cluster

Step Setup Gossiping Property File

Step Starting Cassandra Services

Step Connecting to Cassandra

Installing on CentOS

Demo Installing and Configuring Cassandra on Ubuntu

Demo Installing and Configuring Cassandra on Ubuntu

Quiz

Summary

Conclusion

Cassandra Data Model

Cassandra Data Model

Course Map

Objectives

Cassandra Data Model

Cassandra Data Model Components

Keyspaces

Tables

Columns

UUID and TimeUUID

Counter

Compound Keys

Indexes

Collection Columns

Collection Columns Set

DDL and DML Statements

DDL Statements

DML Statements INSERT

DML Statements UPDATE

DML Statements COPY

DML Statements SELECT

SELECT Statements Restrictions

Valid and Invalid SELECT Statements Example

DML Statements DELETE

Demo Data Definition and Data Manipulation Statements

Demo Data Definition and Data Manipulation Statements

Demo Create a Table with Composite Key

Demo Create a Table with Composite Key

Demo Collection Columns in Cassandra

Demo Collection Columns in Cassandra

Quiz

Summary

Conclusion

Apache Storm

Introduction

Introduction

Course Objectives

Course Overview

Target Audience

Prerequisites for the Course

Lessons Covered

Conclusion

Big Data Overview

Big Data Overview

Objectives

Big Data

Vs of Big Data

Data Volume

Data Sizes

Velocity of Data

Variety of Data

Data Evolution

Features of Big data

Industry Examples

Big data Analysis

Technology Comparison

Apache Hadoop

HDFS

MapReduce

Real Time Big data

Real Time Big data Examples

Real Time Big data Tools

Zookeeper

Quiz

Summary

Conclusion

Introduction to Storm

Introduction to Storm

Objectives

Apache Storm

Uses of Storm

What is a Stream

Industry Use Cases for Storm

Storm Data Model

Storm Architecture

Storm Processes

Sample Program

Storm Components

Storm Spout

Storm Bolt

Storm Topology

Storm Example

Serialization Deserialization

Submitting a Job to Storm

Types of Topologies

Installing Ubuntu VM and Connecting with PuTTY Demo

Quiz

Summary

Conclusion

Installation and Configuration

Installation and Configuration

Objectives

Storm Versions

OS selection

Machine Selection

Preparing for Installation

Download Kafka

Download Storm

Install Kafka Demo

Install Storm Demo

Setting Up Multi node Storm Cluster

Quiz

Summary

Conclusion

Storm Advanced Concepts

Storm Advanced Concepts

Objectives

Types of Spouts

Structure of Spout

Structure of Bolt

Stream Groupings

Reliable Processing in Storm

Ack and Fail

Ack Timeout

Anchoring

Topology Lifecycle

Data Ingestion in Storm

Data Ingestion in Storm Example

Data Ingestion in Storm Check Output

Screen Shots for Real Time Data Ingestion

Spout Definition

Bolt Definition

Topology Connecting Spout and Bolt

Wrapper Class

Quiz

Summary

Conclusion

Storm Interfaces

Storm Interfaces

Objectives

Storm Interfaces

Java Interface to Storm

Compile and run a Java interface to Storm Demo

Spout Interface

IRichSpout Methods

BaseRichSpout Methods

OutputFieldsDeclarer Interface

Spout Definition Example

Bolt Interface

IRichBolt Methods

BaseRichBolt Methods

IBasicBolt Methods

Bolt Interface Example

Bolt Interface Example

Topology Interface

Topology Builder Methods

Bolt Declarer Methods

Storm Submitter Methods

Topology Builder Example

Apache Kafka Recap

Kafka Data Model

Apache Cassandra Recap

Real Time Data Analysis Platform

Kafka Interface to Storm

Kafka Spout

Kafka Spout Configuration

Kafka Spout Schemes

Using Kafka Spout in Storm

Storm Interface to Cassandra

Insert or Update Cassandra

Setting Up Cassandra Session

Insert or Update Data into Cassandra from Bolt

Kafka Storm Cassandra

Quiz

Summary

Conclusion

Java Essential for Hadoop

Essentials of Java for Hadoop

Essentials of Java for Hadoop

Objectives

Java Definition

Java Virtual Machine (JVM)

Working of Java

Running a Basic Java Program

Running a Basic Java Program (contd)

Running a Basic Java Program in NetBeans IDE

BASIC JAVA SYNTAX

Data Types in Java

Variables in Java

Naming Conventions of Variables

Type Casting

Operators

Mathematical Operators

Unary Operators

Relational Operators

Logical or Conditional Operators

Bitwise Operators

Static Versus Non Static Variables

Static Versus Non Static Variables (contd)

Statements and Blocks of Code

Flow Control

If Statement

Variants of if Statement

Nested If Statement

Switch Statement

Switch Statement (contd)

Loop Statements

Loop Statements (contd)

Break and Continue Statements

Basic Java Constructs

Arrays

Arrays (contd)

JAVA CLASSES AND METHODS

Classes

Objects

Methods

Access Modifiers

Summary

Thank You

Java Constructors

Java Constructors

Objectives

Features of Java

Classes Objects and Constructors

Constructors

Constructor Overloading

Constructor Overloading (contd)

PACKAGES

Definition of Packages

Advantages of Packages

Naming Conventions of Packages

INHERITANCE

Definition of Inheritance

Multilevel Inheritance

Hierarchical Inheritance

Method Overriding

Method Overriding (contd)

Method Overriding (contd)

ABSTRACT CLASSES

Definition of Abstract Classes

Usage of Abstract Classes

INTERFACES

Features of Interfaces

Syntax for Creating Interfaces

Implementing an Interface

Implementing an Interface (contd)

INPUT AND OUTPUT

Features of Input and Output

System.in.read() Method

Reading Input from the Console

Stream Objects

String Tokenizer Class

Scanner Class

Writing Output to the Console

Summary

Thank You

Essential Classes and Exceptions in Java

Essential Classes and Exceptions in Java

Objectives

The Enums in Java

Program Using Enum

Array List

Array List Constructors

Methods of Array List

Array List Insertion

Array List Insertion (contd)

Iterator

Iterator (contd)

List Iterator

List Iterator (contd)

Displaying Items Using List Iterator

For Each Loop

For Each Loop (contd)

Enumeration

Enumeration (contd)

HASHMAPS

Features of Hash maps

Hash map Constructors

Hash map Methods

Hash map Insertion

HASHTABLE CLASS

Hash table Class and Constructors

Hash table Methods

Hash table Methods

Hash table Insertion and Display

Hash table Insertion and Display (contd)

EXCEPTIONS

Exception Handling

Exception Classes

User Defined Exceptions

Types of Exceptions

Exception Handling Mechanisms

Try Catch Block

Multiple Catch Blocks

Throw Statement

Throw Statement (contd)

User Defined Exceptions

Advantages of Using Exceptions

Error Handling and finally block

Summary

Thank You

Apache Kafka

Course Introduction

Course Objectives

Course Overview

Target Audience

Prerequisites

Lessons Covered

Conclusion

Big Data Overview

Big Data Overview

Objectives

Big Data Introduction

The Three Vs of Big Data

Data Volume

Data Sizes

Data Velocity

Data Variety

Data Evolution

Features of Big data

Industry Examples

Big Data Analysis

Technology Comparison

Stream

Apache Hadoop

Hadoop Distributed File System

MapReduce

Real Time Big Data Tools

Apache Kafka

Apache Storm

Apache Spark

Apache Cassandra

Apache HBase

Real Time Big Data Tools Uses

Real Time Big Data Use Cases

Quiz

Summary

Conclusion

Introduction to ZooKeeper

Introduction to ZooKeeper

Objectives

ZooKeeper Introduction

Distributed Applications

Challenges of Distributed Applications

Partial Failures

Race Conditions

Deadlocks

Inconsistencies

ZooKeeper Characteristics

ZooKeeper Data Model

Types of Znodes

Sequential Znodes

VMware

Simplilearn Virtual Machine

PuTTY

WinSCP

Demo Install and Setup VM

Demo Install and Setup VM

ZooKeeper Installation

ZooKeeper Configuration

ZooKeeper Command Line Interface

ZooKeeper Command Line Interface Commands

ZooKeeper Client APIs

ZooKeeper Recipe Handling Partial Failures

ZooKeeper Recipe Leader Election

Quiz

Summary

Conclusion

Introduction to Kafka

Introduction to Kafka

Objectives

Apache Kafka Introduction

Kafka History

Kafka Use Cases

Aggregating User Activity Using Kafka Example

Kafka Data Model

Topics

Partitions

Partition Distribution

Producers

Consumers

Kafka Architecture

Types of Messaging Systems

Queue System Example

Publish Subscribe System Example

Brokers

Kafka Guarantees

Kafka at LinkedIn

Replication in Kafka

Persistence in Kafka

Quiz

Summary

Conclusion

Installation and Configuration

Installation and Configuration

Objectives

Kafka Versions

OS Selection

Machine Selection

Preparing for Installation

Demo Kafka Installation and Configuration

Demo Kafka Installation and Configuration

Demo Creating and Sending Messages

Demo Creating and Sending Messages

Stop the Kafka Server

Setting up Multi Node Kafka Cluster Step

Setting up Multi Node Kafka Cluster Step

Setting up Multi Node Kafka Cluster Step

Setting up Multi Node Kafka Cluster Step

Setting up Multi Node Kafka Cluster Step

Setting up Multi Node Kafka Cluster Step

Quiz

Summary

Conclusion

Kafka Interfaces

Kafka Interfaces

Objectives

Kafka Interfaces Introduction

Creating a Topic

Modifying a Topic

kafka-topics.sh Options

Creating a Message

kafka-console-producer.sh Options

Creating a Message Example

Creating a Message Example

Reading a Message

kafka-console-consumer.sh Options

Reading a Message Example

Java Interface to Kafka

Producer Side API

Producer Side API Example Step

Producer Side API Example Step

Producer Side API Example Step

Producer Side API Example Step

Producer Side API Example Step

Consumer Side API

Consumer Side API Example Step

Consumer Side API Example Step

Consumer Side API Example Step

Consumer Side API Example Step

Consumer Side API Example Step

Compiling a Java Program

Running the Java Program

Java Interface Observations

Exercise Tasks

Exercise Tasks (contd)

Exercise Solutions

Exercise Solutions (contd)

Exercise Solutions (contd)

Exercise Tasks

Exercise Tasks (contd)

Exercise Solutions

Exercise Solutions (contd)

Exercise Solutions (contd)

Exercise Solutions (contd)

Exercise Solutions (contd)

Quiz

Summary

Thank You

Impala Training

Course Introduction

Course Introduction

Table of Contents

Objectives

Course Overview

Value to Professionals and Organizations

Course Prerequisites

Conclusion

Introduction to Impala

Introduction to Impala

Objectives

What is Impala

Benefits of Impala

Benefits of Impala (Contd)

Exploratory Business Intelligence

Impala Installation

Demo Using Cloudera Manager for Impala

Demo Using Cloudera Manager for Impala (contd)

Starting and Stopping Impala

Demo Starting Impala from Command Line

Demo Starting Impala from Command Line (contd)

Data Storage

Managing Metadata

Controlling Access to Data

Impala Shell Commands and Interface

Impala Shell Commands and Interface (contd)

Demo Launching Impala Shell and Shell Command

Demo Launching Impala Shell and Shell Command (contd)

Quiz

Summary

Summary (contd)

Conclusion

Querying with Hive and Impala

Querying with Hive and Impala

Objectives

SQL Language Statements

DDL Statements

DML Statements

CREATE DATABASE

CREATE TABLE

CREATE TABLE Examples

Internal and External Tables

Loading Data into Impala Table

ALTER TABLE

DROP TABLE

DROP DATABASE

DESCRIBE Statement

EXPLAIN Statement

SHOW TABLE Statement

INSERT Statement

INSERT Statement Examples

SELECT Statement

Data Type

Data Type (contd)

Operators

Functions

CREATE VIEW in Impala

Hive and Impala Query Syntax

Demo Using Impala Shell for DDL and DML SQL Statements

Demo Using Impala Shell for DDL and DML SQL Statements (contd)

Quiz

Summary

Conclusion

Data Storage and File Format

Data Storage and File Format

Objectives

Partitioning Tables

SQL Statements for Partitioned Tables

File Format and Performance Considerations

Choosing File Type and Compression Technique

Demo File Formats and Compression Techniques

Demo File Formats and Compression Techniques (contd)

Quiz

Summary

Conclusion

Working with Impala

Working with Impala

Objectives

Impala Architecture

Impala Daemon

Impala Statestore

Impala Catalog Service

Query Execution Flow in Impala

User Defined Functions

Hive UDFs with Impala

Demo UDF in Impala

Demo UDF in Impala (contd)

Improving Impala Performance

Quiz

Summary

Conclusion

Big Data Architect Master's Program
 at 
Simplilearn 
Faculty details

Ronald van Loon
Named by Onalytica as one of the three most influential people in Big Data, Ronald is also an author for a number of leading Big Data and Data Science websites, including Datafloq, Data Science Central, and The Guardian. He also regularly speaks at renowned events.
Sina Jamshidi
Sina has over 10 years of experience in the technology field as a Big Data Architect at Bell Labs and a Platinum level trainer. Sina is very passionate about building a Big Data education ecosystem and has contributed to a number of public and journal publications.

Big Data Architect Master's Program
 at 
Simplilearn 
Students Ratings & Reviews

5/5
2 Ratings
Soumita Deb
Big Data Architect Master's Program
Offered by Simplilearn
5
Great Content, Good Course
Other: I enrolled in the Big Data Hadoop Master Program and the experience so far has been very satisfying because the course content is presented in a crisp and concise manner. The instructors have up-to-date knowledge of the subject and teach concepts effectively. The hands-on experience obtained through the cloud labs is one of the many commendable aspects that Simplilearn offers. I also plan to take up the Data Scientist Master's Program. Thanks, Simplilearn, for all your support!
Reviewed on 13 Jun 2019
Nairita Banerjee
Big Data Architect Master's Program
Offered by Simplilearn
5
Well-structured, excellent platform to get hands-on experience
Other: I enrolled in the Big Data Hadoop Architect Program at Simplilearn. The sessions are comprehensive, well-planned and the reading material provided is exceptionally good. The trainers are marvelous as they explain every concept in detail with ease. The Cloudlab feature provided is an excellent platform for anyone looking to get hands-on experience as you can access it 24*7. Keep the good work going, Simplilearn!
Reviewed on 13 Jun 2019