Big Data Architect Master's Program
- Offered by Simplilearn
- Private Institute
- Estd. 2010
Big Data Architect Master's Program at Simplilearn Overview
Mode of learning | Online |
Difficulty level | Intermediate |
Credential | Certificate |
Big Data Architect Master's Program at Simplilearn Highlights
- Certificate of Completion
- IIMJobs Pro-Membership for 6 Months
- 50+ In-Demand Skills & Tools
- 12+ Real-Life Projects
Big Data Architect Master's Program at Simplilearn Course details
- Software Developers
- Testers
- Software Architects
- Analytics Professionals
- Data Management Professionals
- Data warehouse Professionals
- Project Managers
- Mainframe Professionals
- Graduates looking to build a career in Big Data Hadoop
- Simplilearn’s Big Data Hadoop Architect Masters Program will help you master skills and tools like Cassandra Architecture, Data Model Creation, Database Interfaces, Advanced Architecture, Spark, Scala, RDD, SparkSQL, Spark Streaming, Spark ML, GraphX, Replication, Sharding, Scalability, Hadoop clusters, Storm Architecture, Ingestion, Zookeeper and Kafka Architecture. These skills will help you prepare for the role of a Big Data Hadoop architect
- The program provides access to high-quality eLearning content, simulation exams, a community moderated by experts, and other resources that ensure you follow the optimal path to your dream role of data scientist
- Tools Covered - Advanced Analytics Tools, Data Collection and Storage Tools, ETL Tools, File System, Programming Tools
Big Data Architect Master's Program at Simplilearn Curriculum
Big Data Hadoop Developer- Online Flexipass
Course Introduction
Why Big Data
What is Big Data
What is Big Data (contd)
Facts about Big Data
Evolution of Big Data
Case Study Netflix and the House of Cards
Market Trends
Course Objectives
Course Details
Project Submission and Certification
On Demand Support
Key Features
Conclusion
Introduction to Big Data and Hadoop
Objectives
Data Explosion
Types of Data
Need for Big Data
Big Data and Its Sources
Characteristics of Big Data
Characteristics of Big Data Technology
Knowledge Check
Leveraging Multiple Data Sources
Traditional IT Analytics Approach
Traditional IT Analytics Approach (contd)
Big Data Technology Platform for Discovery and Exploration
Big Data Technology Platform for Discovery and Exploration (contd)
Big Data Technology Capabilities
Big Data Use Cases
Handling Limitations of Big Data
Introduction to Hadoop
History and Milestones of Hadoop
Organizations Using Hadoop
VMware Player Introduction
VMware Player Hardware Requirements
Oracle VirtualBox to Open a VM
Installing VM using Oracle VirtualBox Demo
Opening a VM using Oracle VirtualBox Demo
Quiz
Summary
Conclusion
Hadoop Architecture
Hadoop Architecture
Objectives
Key Terms
Hadoop Cluster Using Commodity Hardware
Hadoop Configuration
Hadoop Core Services
Apache Hadoop Core Components
Why HDFS
What is HDFS
HDFS Real Life Connect
Regular File System vs HDFS
HDFS Characteristics
HDFS Key Features
HDFS Architecture
NameNode in HA mode
NameNode HA Architecture
HDFS Operation Principle
File System Namespace
NameNode Operation
Data Block Split
Benefits of Data Block Approach
HDFS Block Replication Architecture
Replication Method
Data Replication Topology
Data Replication Representation
HDFS Access
Business Scenario
Create a new Directory in HDFS Demo
Spot the Error
Quiz
Case Study
Case Study Demo
Summary
Conclusion
Hadoop Deployment
Hadoop Deployment
Objectives
Ubuntu Server Introduction
Installation of Ubuntu Server
Business Scenario
Installing Ubuntu Server Demo
Hadoop Installation Prerequisites
Hadoop Installation
Installing Hadoop Demo
Hadoop Multi Node Installation Prerequisites
Steps for Hadoop Multi Node Installation
Single Node Cluster vs Multi Node Cluster
Creating a Clone of Hadoop VM Demo
Performing Clustering of the Hadoop Environment Demo
Spot the Error
Quiz
Case Study
Case Study Demo
Summary
Conclusion
Introduction to MapReduce
Introduction to YARN and MapReduce
Objectives
Why YARN
What is YARN
YARN Real Life Connect
YARN Infrastructure
YARN Infrastructure (contd)
Resource Manager
Other Resource Manager Components
Resource Manager in HA Mode
Application Master
Node Manager
Container
Applications Running on YARN
Application Startup in YARN
Application Startup in YARN (contd)
Role of AppMaster in Application Startup
Why MapReduce
What is MapReduce
MapReduce Real Life Connect
MapReduce Analogy
MapReduce Analogy (contd)
MapReduce Example
Map Execution
Map Execution Distributed Two Node Environment
MapReduce Essentials
MapReduce Jobs
MapReduce and Associated Tasks
Hadoop Job Work Interaction
Characteristics of MapReduce
Real time Uses of MapReduce
Prerequisites for Hadoop Installation in Ubuntu Desktop
Steps to Install Hadoop
Business Scenario
Set up Environment for MapReduce Development
Small Data and Big Data
Uploading Small Data and Big Data
Installing Ubuntu Desktop OS Demo
Build MapReduce Program
Build a MapReduce Program Demo
Hadoop MapReduce Requirements
Steps of Hadoop MapReduce
MapReduce Responsibilities
MapReduce Java Programming in Eclipse
Create a New Project
Checking Hadoop Environment for MapReduce
Build a MapReduce Application using Eclipse and Run in Hadoop Cluster Demo
MapReduce v
Spot the Error
Quiz
Case Study
Case Study Demo
Summary
Conclusion
Advanced HDFS and MapReduce
Advanced HDFS and MapReduce
Objectives
Advanced HDFS Introduction
HDFS Benchmarking
Setting Up HDFS Block Size
Decommissioning a DataNode
Business Scenario
HDFS Demo
Setting HDFS block size in Hadoop Demo
Advanced MapReduce
Interfaces
Data Types in Hadoop
Data Types in Hadoop (contd)
InputFormats in MapReduce
OutputFormats in MapReduce
Distributed Cache
Using Distributed Cache Step
Using Distributed Cache Step
Using Distributed Cache Step
Joins in MapReduce
Reduce Side Join
Reduce Side Join (contd)
Replicated Join
Replicated Join (contd)
Composite Join
Composite Join (contd)
Cartesian Product
Cartesian Product (contd)
MapReduce program for Writable classes Demo
Spot the Error
Quiz
Case Study
Case Study Demo
Summary
Conclusion
Business Analytics Foundation With R Tools
Business Analytics Foundation With R Tools
Business Analytics Foundation With R Tools
Objectives
Analytics
Places Where Analytics is Applied
Topics Covered
Topics Covered (contd)
Career Path
Thank You
Introduction to Analytics
Introduction to Analytics
Analytics vs Analysis
What is Analytics
Popular Tools
Role of a Data Scientist
Data Analytics Methodology
Problem Definition
Summarizing Data
Data collection
Data Dictionary
Outlier Treatment
Quiz
Statistical Concepts And Their Application In Business
Statistical Concepts And Their Application In Business
Descriptive Statistics
Probability Theory
Tests of Significance
Non parametric Testing
Quiz
Basic Analytic Techniques Using R
Introduction
Data Exploration
Data Visualization
Pie Charts
Correlation
Analysis of variance
Chi squared test
T test
Summary
Quiz
Predictive Modelling Techniques
Predictive Modelling Techniques
Regression Analysis and Types of regression models
Linear Regression
Coefficient of determination R²
How good is the model
How to find Linear regression equation
Commands to perform linear regression
Linear regression to predict sales
Case Study Linear Regression
Case Study Classification
Logistic regression
Example Logistic regression in R
Logistic Regression Predicting recurrent visits to a web site
Cluster Analysis
Command to perform clustering in R
Hierarchical Clustering
Case Study Implement K means and Hierarchical Clustering
Time Series
Cyclical versus seasonal analysis
Decomposition of Time Series
Case Study Time Series Analysis
Decomposing Non Seasonal Time Series
Exponential Smoothing
Advantages and Disadvantages of Exponential Smoothing
Exponential smoothing and forecasting in R
Example Holt Winters
White Noise
Correlogram Analysis
Box Jenkins forecasting Models
Case Study Time Series Data using ARMA
Business Case
Summary
Thank You
Java Essentials for Hadoop
Essentials of Java for Hadoop
Essentials of Java for Hadoop
Objectives
Java Definition
Java Virtual Machine (JVM)
Working of Java
Running a Basic Java Program
Running a Basic Java Program (contd)
Running a Basic Java Program in NetBeans IDE
BASIC JAVA SYNTAX
Data Types in Java
Variables in Java
Naming Conventions of Variables
Type Casting
Operators
Mathematical Operators
Unary Operators
Relational Operators
Logical or Conditional Operators
Bitwise Operators
Static Versus Non Static Variables
Static Versus Non Static Variables (contd)
Statements and Blocks of Code
Flow Control
If Statement
Variants of if Statement
Nested If Statement
Switch Statement
Switch Statement (contd)
Loop Statements
Loop Statements (contd)
Break and Continue Statements
Basic Java Constructs
Arrays
Arrays (contd)
JAVA CLASSES AND METHODS
Classes
Objects
Methods
Access Modifiers
Summary
Thank You
Java Constructors
Java Constructors
Objectives
Features of Java
Classes Objects and Constructors
Constructors
Constructor Overloading
Constructor Overloading (contd)
PACKAGES
Definition of Packages
Advantages of Packages
Naming Conventions of Packages
INHERITANCE
Definition of Inheritance
Multilevel Inheritance
Hierarchical Inheritance
Method Overriding
Method Overriding(contd)
Method Overriding(contd)
ABSTRACT CLASSES
Definition of Abstract Classes
Usage of Abstract Classes
INTERFACES
Features of Interfaces
Syntax for Creating Interfaces
Implementing an Interface
Implementing an Interface(contd)
INPUT AND OUTPUT
Features of Input and Output
System.in.read() Method
Reading Input from the Console
Stream Objects
String Tokenizer Class
Scanner Class
Writing Output to the Console
Summary
Thank You
Essential Classes and Exceptions in Java
Essential Classes and Exceptions in Java
Objectives
The Enums in Java
Program Using Enum
Array List
ArrayList Constructors
Methods of Array List
ArrayList Insertion
ArrayList Insertion (contd)
Iterator
Iterator (contd)
ListIterator
ListIterator (contd)
Displaying Items Using List Iterator
For Each Loop
For Each Loop (contd)
Enumeration
Enumeration (contd)
HASHMAPS
Features of Hashmaps
Hashmap Constructors
Hashmap Methods
Hashmap Insertion
HASHTABLE CLASS
Hashtable Class and Constructors
Hashtable Methods
Hashtable Methods
Hashtable Insertion and Display
Hashtable Insertion and Display (contd)
EXCEPTIONS
Exception Handling
Exception Classes
User Defined Exceptions
Types of Exceptions
Exception Handling Mechanisms
Try Catch Block
Multiple Catch Blocks
Throw Statement
Throw Statement (contd)
User Defined Exceptions
Advantages of Using Exceptions
Error Handling and finally block
Summary
Thank You
Apache Spark and Scala- Online Flexipass
Course Overview
Introduction
Course Objectives
Course Overview
Target Audience
Course Prerequisites
Value to the Professionals
Value to the Professionals (contd)
Value to the Professionals (contd)
s Covered
Conclusion
Introduction to Spark
Introduction
Objectives
Evolution of Distributed Systems
Need of New Generation Distributed Systems
Limitations of MapReduce in Hadoop
Limitations of MapReduce in Hadoop (contd)
Batch vs Real Time Processing
Application of Stream Processing
Application of In Memory Processing
Introduction to Apache Spark
Components of a Spark Project
History of Spark
Language Flexibility in Spark
Spark Execution Architecture
Automatic Parallelization of Complex Flows
Automatic Parallelization of Complex Flows Important Points
APIs That Match User Goals
Apache Spark A Unified Platform of Big Data Apps
More Benefits of Apache Spark
Running Spark in Different Modes
Installing Spark as a Standalone Cluster Configurations
Installing Spark as a Standalone Cluster Configurations
Demo Install Apache Spark
Demo Install Apache Spark
Overview of Spark on a Cluster
Tasks of Spark on a Cluster
Companies Using Spark Use Cases
Hadoop Ecosystem vs Apache Spark
Hadoop Ecosystem vs Apache Spark (contd)
Quiz
Summary
Summary (contd)
Conclusion
Introduction to Programming in Scala
Introduction
Objectives
Introduction to Scala
Features of Scala
Basic Data Types
Basic Literals
Basic Literals (contd)
Basic Literals (contd)
Introduction to Operators
Types of Operators
Use Basic Literals and the Arithmetic Operator
Demo Use Basic Literals and the Arithmetic Operator
Use the Logical Operator
Demo Use the Logical Operator
Introduction to Type Inference
Type Inference for Recursive Methods
Type Inference for Polymorphic Methods and Generic Classes
Unreliability on Type Inference Mechanism
Mutable Collection vs Immutable Collection
Functions
Anonymous Functions
Objects
Classes
Use Type Inference, Functions, Anonymous Function, and Class
Demo Use Type Inference, Functions, Anonymous Function and Class
Traits as Interfaces
Traits Example
Collections
Types of Collections
Types of Collections (contd)
Lists
Perform Operations on Lists
Demo Use Data Structures
Maps
Maps Operations
Pattern Matching
Implicits
Implicits (contd)
Streams
Use Data Structures
Demo Perform Operations on Lists
Quiz
Summary
Summary (contd)
Conclusion
Using RDD for Creating Applications in Spark
Introduction
Objectives
RDDs API
Features of RDDs
Creating RDDs
Creating RDDs - Referencing an External Dataset
Referencing an External Dataset - Text Files
Referencing an External Dataset - Text Files (contd)
Referencing an External Dataset - Sequence Files
Referencing an External Dataset - Other Hadoop Input Formats
Creating RDDs - Important Points
RDD Operations
RDD Operations Transformations
Features of RDD Persistence
Storage Levels Of RDD Persistence
Choosing The Correct RDD Persistence Storage Level
Invoking the Spark Shell
Importing Spark Classes
Creating the Spark Context
Loading a File in Shell
Performing Some Basic Operations on Files in Spark Shell RDDs
Packaging a Spark Project with SBT
Running a Spark Project With SBT
Demo Build a Scala Project
Build a Scala Project
Demo Build a Spark Java Project
Build a Spark Java Project
Shared Variables Broadcast
Shared Variables Accumulators
Writing a Scala Application
Demo Run a Scala Application
Run a Scala Application
Demo Write a Scala Application Reading the Hadoop Data
Write a Scala Application Reading the Hadoop Data
Demo Run a Scala Application Reading the Hadoop Data
Run a Scala Application Reading the Hadoop Data
Scala RDD Extensions
DoubleRDD Methods
PairRDD Methods - Join
PairRDD Methods - Others
Java PairRDD Methods
Java PairRDD Methods (contd)
General RDD Methods
General RDD Methods (contd)
Java RDD Methods
Java RDD Methods (contd)
Common Java RDD Methods
Spark Java Function Classes
Method for Combining JavaPairRDD Functions
Transformations in RDD
Other Methods
Actions in RDD
Key Value Pair RDD in Scala
Key Value Pair RDD in Java
Using MapReduce and Pair RDD Operations
Reading Text File from HDFS
Reading Sequence File from HDFS
Writing Text Data to HDFS
Writing Sequence File to HDFS
Using GroupBy
Using GroupBy (contd)
Demo Run a Scala Application Performing GroupBy Operation
Run a Scala Application Performing GroupBy Operation
Demo Run a Scala Application Using the Scala Shell
Run a Scala Application Using the Scala Shell
Demo Write and Run a Java Application
Write and Run a Java Application
Quiz
Summary
Summary (contd)
Conclusion
Running SQL Queries Using Spark SQL
Introduction
Objectives
Importance of Spark SQL
Benefits of Spark SQL
DataFrames
SQLContext
SQLContext (contd)
Creating a DataFrame
Using DataFrame Operations
Using DataFrame Operations (contd)
Demo Run SparkSQL with a Dataframe
Run SparkSQL with a Dataframe
Interoperating with RDDs
Using the Reflection Based Approach
Using the Reflection Based Approach (contd)
Using the Programmatic Approach
Using the Programmatic Approach (contd)
Demo Run Spark SQL Programmatically
Run Spark SQL Programmatically
Data Sources
Save Modes
Saving to Persistent Tables
Parquet Files
Partition Discovery
Schema Merging
JSON Data
Hive Table
DML Operation Hive Queries
Demo Run Hive Queries Using Spark SQL
Run Hive Queries Using Spark SQL
JDBC to Other Databases
Supported Hive Features
Supported Hive Features (contd)
Supported Hive Data Types
Case Classes
Case Classes (contd)
Quiz
Summary
Summary (contd)
Conclusion
Spark Streaming
Introduction
Objectives
Introduction to Spark Streaming
Working of Spark Streaming
Features of Spark Streaming
Streaming Word Count
Micro Batch
DStreams
DStreams (contd)
Input DStreams and Receivers
Input DStreams and Receivers (contd)
Basic Sources
Advanced Sources
Advanced Sources Twitter
Transformations on DStreams
Transformations on DStreams (contd)
Output Operations on DStreams
Design Patterns for Using ForeachRDD
DataFrame and SQL Operations
DataFrame and SQL Operations (contd)
Checkpointing
Enabling Checkpointing
Socket Stream
File Stream
Stateful Operations
Window Operations
Types of Window Operations
Types of Window Operations (contd)
Join Operations Stream Dataset Joins
Join Operations Stream Stream Joins
Monitoring Spark Streaming Application
Performance Tuning High Level
Performance Tuning Detail Level
Demo Capture and Process the Netcat Data
Capture and Process the Netcat Data
Demo Capture and Process the Flume Data
Capture and Process the Flume Data
Demo Capture the Twitter Data
Capture the Twitter Data
Quiz
Summary
Summary (contd)
Conclusion
MongoDB Developer and Administrator- Online Flexipass
Course Introduction
Course Introduction
Table of Contents
Objectives
Course Overview
Value to Professionals and Organizations
Course Prerequisites
s Covered
Conclusion
Introduction to NoSQL databases
NoSQL Database Introduction
Objectives
What is NoSQL?
What is NoSQL?(contd)
Why NoSQL?
Difference Between RDBMS and NoSQL Databases
Benefits of NoSQL
Benefits of NoSQL (contd)
Types of NoSQL
Key Value Database
Key Value Database (contd)
Document Database
Document Database Example
Column Based Database
Column Based Database (contd)
Column Based Database (contd)
Column Based Database Example
Graph Database
Graph Database (contd)
CAP Theorem
CAP Theorem (contd)
Consistency
Availability
Partition Tolerance
MongoDB as per CAP
Quiz
Summary
Conclusion
MongoDB A Database for the Modern Web
MongoDB A Database for the Modern Web
Objectives
What is MongoDB?
JSON
JSON Structure
BSON
MongoDB Structure
Document Store Example
MongoDB as a Document Database
Transaction Management in MongoDB
Easy Scaling
Scaling Up vs Scaling Out
Vertical Scaling
Horizontal Scaling
Features of MongoDB
Secondary Indexes
Replication
Replication (contd)
Memory Management
Replica Set
Auto Sharding
Aggregation and MapReduce
Collection and Database
Schema Design and Modeling
Reference Data Model
Reference Data Model Example
Embedded Data Model
Embedded Data Model Example
Data Types
Data Types (contd)
Data Types (contd)
Core Servers of MongoDB
MongoDB's Tools
Installing MongoDB on Linux
Installing MongoDB on Linux
Installing MongoDB on Windows
Installing MongoDB on Windows
Starting MongoDB On Linux
Starting MongoDB On Linux
Starting MongoDB On Windows
Starting MongoDB On Windows
Use Cases
Use Cases (contd)
Quiz
Summary
Conclusion
CRUD Operations in MongoDB
CRUD Operations in MongoDB
Objectives
Data Modification in MongoDB
Batch Insert in MongoDB
Ordered Bulk Insert
Performing Ordered Bulk Insert
Performing Ordered Bulk Insert
Unordered Bulk Insert
Performing Unordered Bulk Insert
Performing Unordered Bulk Insert
Inserts Internals and Implications
Performing an Insert Operation
Performing an Insert Operation
Retrieving the documents
Specify Equality Condition
Retrieving Documents by Find Query
Retrieving Documents by Find Query
$in, $or , and AND Conditions
$or Operator
Specify AND/OR Conditions
Retrieving Documents by Using FindOne, AND/OR Conditions
Retrieving Documents by Using FindOne, AND/OR Conditions
Regular Expression
Array Exact Match
Array Projection Operators
Retrieving Documents for Array Fields
Retrieving Documents for Array Fields
$where Query
Cursor
Cursor (contd)
Cursor (contd)
Retrieving Documents Using Cursor
Retrieving Documents Using Cursor
Pagination
Pagination Avoiding Larger Skips
Advanced Query Options
Update Operation
Updating Documents in MongoDB
Updating Documents in MongoDB
$set
Updating Embedded Documents in MongoDB
Updating Embedded Documents in MongoDB
Updating Multiple Documents in MongoDB
Updating Multiple Documents in MongoDB
$unset and $inc Modifiers
$inc modifier to increment and decrement
$inc modifier to increment and decrement
Replacing Existing Document with New Document
Replacing Existing Document with New Document
$push and $addToSet
Positional Array Modification
Adding Elements into Array Fields
Adding Elements into Array Fields
Adding Elements to Array Fields Using AddToSet
Adding Elements to Array Fields Using AddToSet
Performing AddToSet
Performing AddToSet
Upsert
Removing Documents
Performing Upsert and Remove Operation
Performing Upsert and Remove Operation
Quiz
Summary
Conclusion
Indexing and Aggregation
Indexing and Aggregation
Objectives
Introduction to Indexing
Types of Index
Properties of Index
Single Field Index
Single Field Index on Embedded Document
Compound Indexes
Index Prefixes
Sort Order
Ensure Indexes Fit RAM
Multi Key Indexes
Compound Multi Key Indexes
Hashed Indexes
TTL Indexes
Unique Indexes
Sparse Indexes
Demo Create Compound, Sparse, and Unique Indexes
Demo Create Compound, Sparse, and Unique Indexes
Text Indexes
Demo Create Single Field and Text Index
Demo Create Single Field and Text Index
Text Search
Index Creation
Index Creation (contd)
Index Creation on Replica Set
Remove Indexes
Modify Indexes
Demo Drop an Index from a Collection
Demo Drop an Index from a Collection
Rebuild Indexes
Listing Indexes
Demo Retrieve Indexes for a Collection and Database
Demo Retrieve Indexes for a Collection and Database
Measure Index Use
Demo Use Mongo Shell Methods to Monitor Indexes
Demo Use Mongo Shell Methods to Monitor Indexes
Control Index Use
Demo Use the Explain, $hint and $natural Operators to Create Index
Demo Use the Explain, $hint and $natural Operators to Create Index
Index Use Reporting
Geospatial Index
Demo Create Geospatial Index
Demo Create Geospatial Index
MongoDB's Geospatial Query Operators
Demo Use Geospatial Index in a Query
Demo Use Geospatial Index in a Query
$geoWithin Operator
Proximity Queries in MongoDB
Aggregation
Aggregation (contd)
Pipeline Operators and Indexes
Aggregate Pipeline Stages
Aggregate Pipeline Stages (contd)
Aggregation Example
Demo Use Aggregate Function
Demo Use Aggregate Function
MapReduce
MapReduce (contd)
MapReduce (contd)
Demo Use MapReduce in MongoDB
Demo Use MapReduce in MongoDB
Aggregation Operations
Demo Use Distinct and Count Methods
Demo Use Distinct and Count Methods
Aggregation Operations (contd)
Demo Use the Group Function
Demo Use the Group Function
Quiz
Summary
Conclusion
Replication and Sharding
Replication and Sharding
Objectives
Introduction to Replication
Master Slave Replication
Replica Set in MongoDB
Replica Set in MongoDB (contd)
Automatic Failover
Replica Set Members
Priority Replica Set Members
Hidden Replica Set Members
Delayed Replica Set Members
Delayed Replica Set Members (contd)
Demo Start a Replica Set
Demo Start a Replica Set
Write Concern
Write Concern (contd)
Write Concern Levels
Write Concern for a Replica Set
Modify Default Write Concern
Read Preference
Read Preference Modes
Blocking for Replication
Tag Set
Configure Tag Sets for Replica set
Replica Set Deployment Strategies
Replica Set Deployment Strategies (contd)
Replica Set Deployment Patterns
Oplog File
Replication State and Local Database
Replication Administration
Demo Check a Replica Set Status
Demo Check a Replica Set Status
Sharding
When to Use Sharding?
What is a Shard?
What is a Shard Key
Choosing a Shard Key
Ideal Shard Key
Range Based Shard Key
Hash Based Sharding
Impact of Shard Keys on Cluster Operation
Production Cluster Architecture
Config Server Availability
Production Cluster Deployment
Deploy a Sharded Cluster
Add Shards to a Cluster
Demo Create a Sharded Cluster
Demo Create a Sharded Cluster
Enable Sharding for Database
Enable Sharding for Collection
Enable Sharding for Collection (contd)
Maintaining a Balanced Data Distribution
Splitting
Chunk Size
Special Chunk Type
Shard Balancing
Shard Balancing (contd)
Customized Data Distribution with Tag Aware Sharding
Tag Aware Sharding
Add Shard Tags
Remove Shard Tags
Quiz
Summary
Conclusion
Apache Cassandra
Course Overview
Course Overview
Course Objectives
Course Overview
Target Audience
Course Prerequisites
Value to the Professionals
s Covered
Conclusion
Apache Cassandra L Overview Big Data and NoSQL Database
Overview of Big Data and NoSQL Database
Course Map
Objectives
The Vs of Big Data
Volume
Data Sizes Terms
Velocity
Variety
Data Evolution
Features of Big Data
Big Data Use Cases
Big Data Analytics
Traditional Technology vs Big Data Technology
Apache Hadoop
HDFS
MapReduce
NoSQL Databases
Brewer's CAP Principle
Approaches to NoSQL Databases Types
Quiz
Summary
Conclusion
Introduction to Cassandra
Introduction to Cassandra
Course Map
Objectives
Introducing Cassandra
Behind the Name
History of Cassandra
Main Features of Cassandra
When is Cassandra Used
Simple Cassandra Program
Cassandra Command Line Interface
Advantages of Cassandra
Limitations of Cassandra
VMware
Simplilearn Virtual Machine
PuTTY
WinSCP
Demo Install and Setup VM
Demo Install and Setup VM
Quiz
Summary
Conclusion
Cassandra Architecture
Cassandra Architecture
Course Map
Objectives
Architecture Requirements of Cassandra
Cassandra Architecture
Cassandra Architecture (contd)
Effects of the Architecture
Cassandra Write Process
Rack
Cassandra Read Process
Example of Cassandra Read Process
Data Partitions
Replication in Cassandra
Network Topology
Snitches
Gossip Protocol
Seed Nodes
Configuration
Virtual Nodes
Token Generator
Example of Token Generator
Failure Scenarios Node Failure
Failure Scenarios Disk Failure
Failure Scenarios Rack Failure
Failure Scenarios Data Center Failure
Quiz
Summary
Conclusion
Cassandra Installation and Configuration
Cassandra Installation and Configuration
Course Map
Objectives
Cassandra Versions
Steps to Install and Configure Cassandra on Ubuntu System
Step Operating System Selection
Step Machine Selection
Step Preparing for Installation
Step Setup Repository
Step Install Cassandra
Step Check the Installation
Step Configuring Cassandra
Step Configuration for a Single Node Cluster
Step Configuration for a Multi Node and Multi Datacenter Clusters
Step Setup Property File
Step Configuration for a Production Cluster
Step Setup Gossiping Property File
Step Starting Cassandra Services
Step Connecting to Cassandra
Installing on CentOS
Demo Installing and Configuring Cassandra on Ubuntu
Demo Installing and Configuring Cassandra on Ubuntu
Quiz
Summary
Conclusion
Cassandra Data Model
Cassandra Data Model
Course Map
Objectives
Cassandra Data Model
Cassandra Data Model Components
Keyspaces
Tables
Columns
UUID and TimeUUID
Counter
Compound Keys
Indexes
Collection Columns
Collection Columns Set
DDL and DML Statements
DDL Statements
DML Statements INSERT
DML Statements UPDATE
DML Statements COPY
DML Statements SELECT
SELECT Statements Restrictions
Valid and Invalid SELECT Statements Example
DML Statements DELETE
Demo Data Definition and Data Manipulation Statements
Demo Data Definition and Data Manipulation Statements
Demo Create a Table with Composite Key
Demo Create a Table with Composite Key
Demo Collection Columns in Cassandra
Demo Collection Columns in Cassandra
Quiz
Summary
Conclusion
Apache Storm
Introduction
Introduction
Course Objectives
Course Overview
Target Audience
Prerequisites for the Course
s Covered
Conclusion
Big Data Overview
Big Data Overview
Objectives
Big Data
Vs of Big Data
Data Volume
Data Sizes
Velocity of Data
Variety of Data
Data Evolution
Features of Big Data
Industry Examples
Big Data Analysis
Technology Comparison
Apache Hadoop
HDFS
MapReduce
Real Time Big Data
Real Time Big Data Examples
Real Time Big Data Tools
Zookeeper
Quiz
Summary
Conclusion
Introduction to Storm
Introduction to Storm
Objectives
Apache Storm
Uses of Storm
What is a Stream
Industry Use Cases for Storm
Storm Data Model
Storm Architecture
Storm Processes
Sample Program
Storm Components
Storm Spout
Storm Bolt
Storm Topology
Storm Example
Serialization Deserialization
Submitting a Job to Storm
Types of Topologies
Installing Ubuntu VM and connecting with PuTTY Demo
Quiz
Summary
Conclusion
Installation and Configuration
Installation and Configuration
Objectives
Storm Versions
OS selection
Machine Selection
Preparing for Installation
Download Kafka
Download Storm
Install Kafka Demo
Install Storm Demo
Setting Up Multi node Storm Cluster
Quiz
Summary
Conclusion
Storm Advanced Concepts
Storm Advanced Concepts
Objectives
Types of Spouts
Structure of Spout
Structure of Bolt
Stream Groupings
Reliable Processing in Storm
Ack and Fail
Ack Timeout
Anchoring
Topology Lifecycle
Data Ingestion in Storm
Data Ingestion in Storm Example
Data Ingestion in Storm Check Output
Screen Shots for Real Time Data Ingestion
Spout Definition
Bolt Definition
Topology Connecting Spout and Bolt
Wrapper Class
Quiz
Summary
Conclusion
Storm Interfaces
Storm Interfaces
Objectives
Storm Interfaces
Java Interface to Storm
Compile and run a Java interface to Storm Demo
Spout Interface
IRichSpout Methods
BaseRichSpout Methods
OutputFieldsDeclarer Interface
Spout Definition Example
Bolt Interface
IRichBolt Methods
BaseRichBolt Methods
IBasicBolt Methods
Bolt Interface Example
Bolt Interface Example
Topology Interface
Topology Builder Methods
Bolt Declarer Methods
Storm Submitter Methods
Topology Builder Example
Apache Kafka Recap
Kafka Data Model
Apache Cassandra Recap
Real Time Data Analysis Platform
Kafka Interface to Storm
Kafka Spout
Kafka Spout Configuration
Kafka Spout Schemes
Using Kafka Spout in Storm
Storm Interface to Cassandra
Insert or Update Cassandra
Setting Up Cassandra Session
Insert or Update Data into Cassandra from Bolt
Kafka Storm Cassandra
Quiz
Summary
Conclusion
Java Essentials for Hadoop
Essentials of Java for Hadoop
Essentials of Java for Hadoop
Objectives
Java Definition
Java Virtual Machine (JVM)
Working of Java
Running a Basic Java Program
Running a Basic Java Program (contd)
Running a Basic Java Program in NetBeans IDE
BASIC JAVA SYNTAX
Data Types in Java
Variables in Java
Naming Conventions of Variables
Type Casting
Operators
Mathematical Operators
Unary Operators
Relational Operators
Logical or Conditional Operators
Bitwise Operators
Static Versus Non Static Variables
Static Versus Non Static Variables (contd)
Statements and Blocks of Code
Flow Control
If Statement
Variants of if Statement
Nested If Statement
Switch Statement
Switch Statement (contd)
Loop Statements
Loop Statements (contd)
Break and Continue Statements
Basic Java Constructs
Arrays
Arrays (contd)
JAVA CLASSES AND METHODS
Classes
Objects
Methods
Access Modifiers
Summary
Thank You
Java Constructors
Java Constructors
Objectives
Features of Java
Classes Objects and Constructors
Constructors
Constructor Overloading
Constructor Overloading (contd)
PACKAGES
Definition of Packages
Advantages of Packages
Naming Conventions of Packages
INHERITANCE
Definition of Inheritance
Multilevel Inheritance
Hierarchical Inheritance
Method Overriding
Method Overriding(contd)
Method Overriding(contd)
ABSTRACT CLASSES
Definition of Abstract Classes
Usage of Abstract Classes
INTERFACES
Features of Interfaces
Syntax for Creating Interfaces
Implementing an Interface
Implementing an Interface(contd)
INPUT AND OUTPUT
Features of Input and Output
System.in.read() Method
Reading Input from the Console
Stream Objects
String Tokenizer Class
Scanner Class
Writing Output to the Console
Summary
Thank You
Essential Classes and Exceptions in Java
Essential Classes and Exceptions in Java
Objectives
The Enums in Java
Program Using Enum
Array List
Array List Constructors
Methods of Array List
Array List Insertion
Array List Insertion (contd)
Iterator
Iterator (contd)
List Iterator
List Iterator (contd)
Displaying Items Using List Iterator
For Each Loop
For Each Loop (contd)
Enumeration
Enumeration (contd)
HASHMAPS
Features of Hash maps
Hash map Constructors
Hash map Methods
Hash map Insertion
HASHTABLE CLASS
Hash table Class and Constructors
Hash table Methods
Hash table Methods (contd)
Hash table Insertion and Display
Hash table Insertion and Display (contd)
EXCEPTIONS
Exception Handling
Exception Classes
User Defined Exceptions
Types of Exceptions
Exception Handling Mechanisms
Try Catch Block
Multiple Catch Blocks
Throw Statement
Throw Statement (contd)
User Defined Exceptions
Advantages of Using Exceptions
Error Handling and finally block
Summary
Thank You
Apache Kafka
Course Introduction
Course Objectives
Course Overview
Target Audience
Prerequisites
Lessons Covered
Conclusion
Big Data Overview
Big Data Overview
Objectives
Big Data Introduction
The Three Vs of Big Data
Data Volume
Data Sizes
Data Velocity
Data Variety
Data Evolution
Features of Big data
Industry Examples
Big Data Analysis
Technology Comparison
Stream
Apache Hadoop
Hadoop Distributed File System
MapReduce
Real Time Big Data Tools
Apache Kafka
Apache Storm
Apache Spark
Apache Cassandra
Apache HBase
Real Time Big Data Tools Uses
Real Time Big Data Use Cases
Quiz
Summary
Conclusion
Introduction to ZooKeeper
Introduction to ZooKeeper
Objectives
ZooKeeper Introduction
Distributed Applications
Challenges of Distributed Applications
Partial Failures
Race Conditions
Deadlocks
Inconsistencies
ZooKeeper Characteristics
ZooKeeper Data Model
Types of Znodes
Sequential Znodes
VMware
Simplilearn Virtual Machine
PuTTY
WinSCP
Demo Install and Setup VM
Demo Install and Setup VM
ZooKeeper Installation
ZooKeeper Configuration
ZooKeeper Command Line Interface
ZooKeeper Command Line Interface Commands
ZooKeeper Client APIs
ZooKeeper Recipe Handling Partial Failures
ZooKeeper Recipe Leader Election
Quiz
Summary
Conclusion
Introduction to Kafka
Introduction to Kafka
Objectives
Apache Kafka Introduction
Kafka History
Kafka Use Cases
Aggregating User Activity Using Kafka Example
Kafka Data Model
Topics
Partitions
Partition Distribution
Producers
Consumers
Kafka Architecture
Types of Messaging Systems
Queue System Example
Publish Subscribe System Example
Brokers
Kafka Guarantees
Kafka at LinkedIn
Replication in Kafka
Persistence in Kafka
Quiz
Summary
Conclusion
Installation and Configuration
Installation and Configuration
Objectives
Kafka Versions
OS Selection
Machine Selection
Preparing for Installation
Demo Kafka Installation and Configuration
Demo Kafka Installation and Configuration
Demo Creating and Sending Messages
Demo Creating and Sending Messages
Stop the Kafka Server
Setting up Multi Node Kafka Cluster Step 1
Setting up Multi Node Kafka Cluster Step 2
Setting up Multi Node Kafka Cluster Step 3
Setting up Multi Node Kafka Cluster Step 4
Setting up Multi Node Kafka Cluster Step 5
Setting up Multi Node Kafka Cluster Step 6
Quiz
Summary
Conclusion
Kafka Interfaces
Kafka Interfaces
Objectives
Kafka Interfaces Introduction
Creating a Topic
Modifying a Topic
kafka-topics.sh Options
Creating a Message
kafka-console-producer.sh Options
Creating a Message Example
Creating a Message Example
Reading a Message
kafka-console-consumer.sh Options
Reading a Message Example
Java Interface to Kafka
Producer Side API
Producer Side API Example Step 1
Producer Side API Example Step 2
Producer Side API Example Step 3
Producer Side API Example Step 4
Producer Side API Example Step 5
Consumer Side API
Consumer Side API Example Step 1
Consumer Side API Example Step 2
Consumer Side API Example Step 3
Consumer Side API Example Step 4
Consumer Side API Example Step 5
Compiling a Java Program
Running the Java Program
Java Interface Observations
Exercise Tasks
Exercise Tasks (contd)
Exercise Solutions
Exercise Solutions (contd)
Exercise Solutions (contd)
Exercise Tasks
Exercise Tasks (contd)
Exercise Solutions
Exercise Solutions (contd)
Exercise Solutions (contd)
Exercise Solutions (contd)
Exercise Solutions (contd)
Quiz
Summary
Thank You
Impala Training
Course Introduction
Course Introduction
Table of Contents
Objectives
Course Overview
Value to Professionals and Organizations
Course Prerequisites
Conclusion
Introduction to Impala
Introduction to Impala
Objectives
What is Impala
Benefits of Impala
Benefits of Impala (contd)
Exploratory Business Intelligence
Impala Installation
Demo Using Cloudera Manager for Impala
Demo Using Cloudera Manager for Impala (contd)
Starting and Stopping Impala
Demo Starting Impala from Command Line
Demo Starting Impala from Command Line (contd)
Data Storage
Managing Metadata
Controlling Access to Data
Impala Shell Commands and Interface
Impala Shell Commands and Interface (contd)
Demo Launching Impala Shell and Shell Command
Demo Launching Impala Shell and Shell Command (contd)
Quiz
Summary
Summary (contd)
Conclusion
Querying with Hive and Impala
Querying with Hive and Impala
Objectives
SQL Language Statements
DDL Statements
DML Statements
CREATE DATABASE
CREATE TABLE
CREATE TABLE Examples
Internal and External Tables
Loading Data into Impala Table
ALTER TABLE
DROP TABLE
DROP DATABASE
DESCRIBE Statement
EXPLAIN Statement
SHOW TABLES Statement
INSERT Statement
INSERT Statement Examples
SELECT Statement
Data Type
Data Type (contd)
Operators
Functions
CREATE VIEW in Impala
Hive and Impala Query Syntax
Demo Using Impala Shell for DDL and DML SQL Statements
Demo Using Impala Shell for DDL and DML SQL Statements (contd)
Quiz
Summary
Conclusion
Data Storage and File Format
Data Storage and File Format
Objectives
Partitioning Tables
SQL Statements for Partitioned Tables
File Format and Performance Considerations
Choosing File Type and Compression Technique
Demo File Formats and Compression Techniques
Demo File Formats and Compression Techniques (contd)
Quiz
Summary
Conclusion
Working with Impala
Working with Impala
Objectives
Impala Architecture
Impala Daemon
Impala Statestore
Impala Catalog Service
Query Execution Flow in Impala
User Defined Functions
Hive UDFs with Impala
Demo UDF in Impala
Demo UDF in Impala (contd)
Improving Impala Performance
Quiz
Summary
Conclusion
Big Data Architect Master's Program at Simplilearn Faculty details
Other courses offered by Simplilearn
Big Data Architect Master's Program at Simplilearn Students Ratings & Reviews