- 32 hours of instructor-led training
- 20 hours of self-paced video
- Includes 4 real industry-based projects
- Prepares you for the Cloudera CCAH ‘CCA-500’ certification exam
- Includes 3 simulation exams aligned to the ‘CCA-500’ certification exam
What is the focus of this course?
The Simplilearn Big Data and Hadoop Administrator course will prepare you for Cloudera’s CCAH ‘CCA-500’ certification and equip you with the skills you need for your next Big Data admin assignment. The course covers the core Hadoop distribution, Apache Hadoop, and a vendor-specific distribution, CDH (Cloudera Distribution of Hadoop).
You will learn why cluster management solutions are needed and what Cloudera Manager can do. The course teaches you how to set up a Hadoop cluster and its components, such as Sqoop, Flume, Pig, Hive, and Impala, with basic or advanced configurations. It also explains Hadoop’s Distributed File System and its processing and computation frameworks, and shows you how to plan, secure, safeguard, and monitor a cluster.
This course will help you understand the basic and advanced concepts of Big Data, the technologies in the Hadoop stack, and the components of the Hadoop ecosystem.
What learning outcomes can be expected?
After completing this course, you will be able to:
- Understand the fundamentals of Big Data, its characteristics, and the scalability options that help organizations manage Big Data
- Master the concepts of the Hadoop framework: its architecture, how the Hadoop Distributed File System works, and how to deploy a Hadoop cluster using core or vendor-specific distributions
- Learn about cluster management solutions such as Cloudera Manager and its capabilities for setting up, deploying, maintaining, and monitoring Hadoop clusters
- Learn Hadoop Administration activities
- Learn about computational frameworks for processing Big Data
- Learn about Hadoop clients, client nodes, and web interfaces such as Hue for working with a Hadoop cluster
- Learn about Cluster planning and tools for data ingestion into Hadoop clusters
- Learn about components of the Hadoop ecosystem such as Hive, HBase, Spark, and Kafka
- Understand how to implement security to protect data and clusters
- Learn about Hadoop cluster monitoring activities
Big Data career opportunities are on the rise, and Hadoop is quickly becoming a must-know technology for the following professionals:
- Systems administrators and IT managers
- IT administrators and operators
- IT systems engineers
- Data engineers and database administrators
- Data analytics administrators
- Cloud systems administrators
- Web engineers
Successful evaluation of one of the following two projects is part of the certification eligibility criteria.
Scalability: Deploying Multiple Clusters
Your company wants to set up a new cluster and has procured new machines; however, setting up clusters on the new machines will take time. Meanwhile, your company wants you to set up a new cluster on the existing set of machines and start testing how the new cluster and its applications work.
Working with Cluster
Demonstrate your understanding of the following tasks (give the steps):
- Enabling and disabling HA for the NameNode and ResourceManager in CDH
- Removing the Hue service from a cluster that has other services set up, such as Hive, HBase, HDFS, and YARN
- Adding a user and granting read access to your Cloudera cluster
- Changing the replication factor and block size of your cluster
- Adding Hue as a service, logging in as user HUE, and downloading examples for Hive, Pig, Job Designer, etc.
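To see why the replication and block size task matters, here is a minimal sketch in Python (not part of the course material) that estimates how many blocks a file is split into and how much raw disk space it occupies across the cluster. The function name `hdfs_footprint` is hypothetical; the default values mirror the usual HDFS settings `dfs.blocksize` (128 MB) and `dfs.replication` (3), and the calculation ignores metadata and checksum overhead.

```python
MB = 1024 * 1024


def hdfs_footprint(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Estimate (block_count, raw_bytes_stored) for a single file.

    Hypothetical helper for illustration only. It assumes the last block
    occupies only the bytes actually written, so raw storage is simply
    file size multiplied by the replication factor.
    """
    if file_size_bytes < 0 or block_size_bytes <= 0 or replication < 1:
        raise ValueError("sizes must be non-negative and replication >= 1")
    # Integer ceiling division: number of blocks needed to hold the file.
    block_count = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    raw_bytes_stored = file_size_bytes * replication
    return block_count, raw_bytes_stored


if __name__ == "__main__":
    one_gb = 1024 * MB
    for block_mb, repl in [(128, 3), (256, 2)]:
        blocks, raw = hdfs_footprint(one_gb, block_mb * MB, repl)
        print(f"block size {block_mb} MB, replication {repl}: "
              f"{blocks} blocks, {raw // MB} MB raw storage")
```

Under these assumptions, a 1 GB file takes 8 blocks and 3072 MB of raw storage with the defaults, and 4 blocks and 2048 MB with a 256 MB block size and a replication factor of 2. Fewer, larger blocks mean less metadata for the NameNode to track, while a lower replication factor saves disk space at the cost of fault tolerance.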
For further practice, there are two more projects to help you start your Hadoop administrator journey.
Data Ingestion and Usage
Ingesting data from external structured databases into HDFS.
Working with data in HDFS by loading it into a data warehouse package such as Hive, then using HiveQL to query and analyze it and to load the results into another set of tables for further use.
Your organization already has a large amount of data in an RDBMS and has now set up a Big Data practice. It wants to move data from the RDBMS into HDFS so that it can analyze the data with software packages such as Apache Hive. The organization would also like to benefit from HDFS features such as automatic replication and fault tolerance.
Securing Data and Cluster
Protecting data stored in your Hadoop cluster by safeguarding it and backing it up.
Your organization has multiple Hadoop clusters and would like to safeguard its data on multiple clusters. The aim is to prevent data loss from accidental deletes and to make critical data available to users/applications even if one or more of these clusters are down.
Exam & certification
What do I need to do to unlock my Simplilearn certificate?
Complete 1 project and 1 simulation test with a minimum score of 80%.
Self-Paced Learning $399
- 180 days of access to high-quality, self-paced learning content designed by industry experts
- 90 days of access to 3+ instructor-led online training classes
- 180 days of access to high-quality, self-paced learning content
Last updated August 10, 2017