Hadoop Administrator
- Offered by Skillsoft
Hadoop Administrator at Skillsoft Overview
Duration | 31 hours |
Total fee | ₹10,618 |
Mode of learning | Online |
Difficulty level | Intermediate |
Credential | Certificate |
Future job roles | CRUD, .Net, CSR, Credit risk, Senior Software Developer |
Hadoop Administrator at Skillsoft Highlights
- Certification from Naukri Learning; content aligned with most certifying bodies
- 400 million+ users; used by professionals in 70% of Fortune 500 companies
Hadoop Administrator at Skillsoft Course details
- Analytics professionals
- Research professionals
- Data scientists
- Anyone wishing to gain knowledge of Apache Spark
- Unlimited access to online content for six months
- Globally recognized course completion certificate
- 400 million+ users; world's No. 1; has trained 70% of Fortune 500 companies
- Career boost for students and professionals
- Hadoop is an Apache Software Foundation project and an open-source software platform for scalable, distributed computing. It can provide fast and reliable analysis of both structured and unstructured data. In this course you will learn about Hadoop's design principles, its cluster architecture, considerations for servers and operating systems, and how to plan a deployment. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.
- The Apache Spark Fundamentals course introduces the various components of the Spark framework used to efficiently process, visualize, and analyze data. The course takes you through Spark applications written in Python, Scala, and Java. You will also learn Apache Spark programming fundamentals such as resilient distributed datasets (RDDs) and which operations to use to transform an RDD. The course also shows you how to save and load data from different data sources, such as different types of files, RDBMS databases, and NoSQL stores. At the end of the course, you will explore an effective Spark application and execute it on a Hadoop cluster to make informed business decisions.
Hadoop Administrator at Skillsoft Curriculum
Apache Spark Fundamentals
Programming and Deploying Apache Spark Applications
Start the course
Describe Apache Spark and the main components of a Spark application
Download and install Apache Spark on Windows 8.1 Pro N
Download and install Apache Spark on Mac OS X Yosemite
Download and install Java Development Kit (JDK) 8 and build Apache Spark using Simple Build Tool (sbt) on Mac OS X Yosemite
Use the Spark shell for analyzing data interactively
Link an application to Spark
Create a Spark context to initialize Apache Spark
Introduce resilient distributed datasets (RDDs) and create a parallelized collection to generate an RDD
Load external datasets to create resilient distributed datasets (RDDs)
Distinguish transformations and actions, describe some of the transformations supported by Spark, and use transformations
Describe some of the actions supported by Spark and use the actions
Use anonymous function syntax and use static methods in a global singleton to pass functions to Spark
Work with key-value pairs
Persist Spark RDDs
Use broadcast variables in a Spark operation
Use accumulators in Spark operations
Use different formats for loading and saving Spark data
Use basic Spark SQL for data queries in a Spark application
Use basic Spark GraphX to work with graphs in a Spark application
Describe how Spark applications run in a cluster
Deploy a Spark application to a cluster
Unit test a Spark application
Describe how to monitor a Spark application or cluster with web UIs
Describe options for scheduling resources across applications in a Spark cluster
Describe how to enable a fair scheduler for fair sharing within an application in a Spark cluster
Configure fair scheduler pool properties for a Spark context within a cluster
Practice programming and deploying a Spark application to a cluster
Hadoop Operations
Designing Hadoop Clusters
Hadoop in the Cloud
Deploying Hadoop Clusters
Hadoop Cluster Availability
Securing Hadoop Clusters
Operating Hadoop Clusters
Stabilizing Hadoop Clusters
Capacity Management for Hadoop Clusters
Performance Tuning of Hadoop Clusters
Cloudera Manager and Hadoop Clusters