Hadoop Administration


Hadoop is an open-source software framework for storing data and running applications. It is designed to scale up from single servers to thousands of machines. Hadoop uses a master/slave architecture and needs substantial memory and CPU resources. It provides both distributed computing and distributed storage. For a long time, people stored information on servers in big databases, and this mostly worked for the types and amounts of data they had. If your data is row- or column-based, this approach still works well. In the last few years, however, some companies have wanted to store and process far more data than that.

The MapReduce layer of Hadoop has two main components:
JobTracker: This is the critical component in this architecture. It monitors the jobs that are running on the cluster.
TaskTracker: This runs tasks on each node of the cluster.

Training Objectives of Hadoop Developer
Training is delivered online by working IT professionals. We market your profile in the IT industry, provide placement after completion of the training, and help you get certified in the course. We provide quality training and quality course material throughout.

Target Students / Prerequisites
Students should have an IT background and be familiar with Java and Linux concepts.

Course Content

Hadoop Architecture
Introduction to
Parallel Computing vs. Distributed Computing
How to install Hadoop cluster on multiple
Hadoop Daemons introduction: NameNode, DataNode, JobTracker, TaskTracker
Exploring HDFS (Hadoop Distributed File System)
Exploring the HDFS Apache Web UI
NameNode architecture (EditLog, FsImage, location of replicas)
Secondary NameNode architecture
DataNode architecture
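
To make the NameNode/DataNode topics above more concrete, here is a minimal Python sketch of how a file might be cut into blocks and how replicas might be spread over DataNodes. The 128 MB block size, the replication factor of 3, the node names and both helper functions are illustrative assumptions. The round-robin placement is a deliberate simplification and is not the placement policy HDFS actually uses.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes (configurable in a real cluster)
REPLICATION = 3                  # assumed replication factor


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block a file of file_size bytes occupies."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_replicas(num_blocks, datanodes, replication=REPLICATION):
    """Toy round-robin placement: map each block index to `replication` distinct nodes.

    Hypothetical helper for illustration only; real HDFS placement is rack-aware.
    """
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the requested replication")
    return {
        i: [datanodes[(i + r) % len(datanodes)] for r in range(replication)]
        for i in range(num_blocks)
    }


# A 300 MB file on a four-node cluster (node names are made up).
blocks = split_into_blocks(300 * 1024 * 1024)
block_map = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
```

The `block_map` dictionary plays the role of the "location of replicas" metadata that the NameNode keeps in memory, while the block contents themselves would live on the DataNodes.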

MapReduce Architecture
Exploring JobTracker/TaskTracker
How a client submits a Map-Reduce job
Exploring Mapper/Reducer/Combiner
Shuffle: Sort & Partition
Input/output formats
Job Scheduling (FIFO, Fair Scheduler, Capacity Scheduler)
Exploring the Apache MapReduce Web UI
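
The mapper, shuffle (sort and partition) and reducer stages listed above can be sketched in plain Python as a word count. This is a single-process teaching model, not Hadoop code: every function name is hypothetical, and the toy `partition` hash stands in for whatever partitioner a real job would use.

```python
from collections import defaultdict


def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def partition(key, num_reducers):
    """Choose a reducer for a key with a stable toy hash.

    Python's built-in hash() of a str is salted per process, so a simple
    byte sum is used here to keep the example reproducible.
    """
    return sum(key.encode("utf-8")) % num_reducers


def shuffle(pairs, num_reducers):
    """Route each pair to its reducer, group values by key, and sort the keys."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)][key].append(value)
    return [sorted(bucket.items()) for bucket in buckets]


def reducer(key, values):
    """Sum all counts seen for one word."""
    return key, sum(values)


def run_job(lines, num_reducers=2):
    """Run map, shuffle and reduce in sequence and merge the reducer outputs."""
    pairs = [pair for line in lines for pair in mapper(line)]
    result = {}
    for bucket in shuffle(pairs, num_reducers):
        for key, values in bucket:
            word, count = reducer(key, values)
            result[word] = count
    return result


counts = run_job(["big data big cluster", "data node"])
```

Because all pairs with the same key go to the same partition, each reducer sees the complete list of values for its keys, which is what makes the per-key sum correct.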

Hadoop Developer Tasks
Writing a MapReduce program
Reading and writing data using Java
Hadoop Eclipse integration
Mapper in detail
Reducer in detail
Using Combiners
Reducing Intermediate Data with Combiners
Writing Partitioners for Better Load Balancing
Sorting in HDFS
Searching in HDFS
Indexing in HDFS
Hands-On Exercise
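
As an illustration of the combiner topics above, the following Python sketch counts how many intermediate records would cross the network with and without map-side pre-aggregation. The log-like input, the two input splits and all function names are made up for this example. It is a model of the idea, not Hadoop's Combiner API.

```python
from collections import Counter


def map_words(lines):
    """Map phase: emit one (word, 1) pair per word occurrence."""
    return [(word, 1) for line in lines for word in line.split()]


def combine(pairs):
    """Combiner: sum counts per word locally, on one mapper's output only."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return sorted(local.items())


def reduce_counts(pairs):
    """Reduce phase: sum counts per word across all mappers."""
    total = Counter()
    for word, count in pairs:
        total[word] += count
    return dict(total)


# Two input splits, each handled by its own (simulated) mapper.
split_a = ["error error warn", "error info"]
split_b = ["warn warn error"]

raw_a, raw_b = map_words(split_a), map_words(split_b)
combined_a, combined_b = combine(raw_a), combine(raw_b)

records_without_combiner = len(raw_a) + len(raw_b)
records_with_combiner = len(combined_a) + len(combined_b)
final_counts = reduce_counts(combined_a + combined_b)
```

The combiner is safe here because addition is associative and commutative, so summing partial sums gives the same result as summing the raw pairs. An operation without that property, such as a plain average, could not reuse the reducer logic as a combiner in this way.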

Hadoop Administrative Tasks
Routine Administrative Procedures
Understanding dfsadmin and mradmin
Block Scanner, Balancer
Health Check & Safe mode
DataNode commissioning/decommissioning
Monitoring and Debugging on a production cluster
NameNode Backup and Recovery
ACL (Access Control List)
Upgrading Hadoop

HBase Architecture
Introduction to HBase
HBase vs. RDBMS
Exploring HBase Master & region server
Column Families and Regions
Basic HBase shell commands

Hive Architecture
Introduction to Hive
HBase vs Hive
Installation of Hive
HQL (Hive query language)
Basic Hive commands

Pig Architecture
Introduction to Pig
Installation of Pig on your system
Basic Pig commands
Hands-On Exercise

Sqoop Architecture
Introduction to Sqoop
Installation of Sqoop on your system
Import/Export data from RDBMS to HDFS
Import/Export data from RDBMS to HBase
Import/Export data from RDBMS to Hive
Hands-On Exercise

Mini Project / POC (Proof of Concept)
Facebook-Hive POC
Uses of Hadoop/Hive at Facebook
Static & dynamic partitioning
UDFs (user-defined functions)
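
The difference between static and dynamic partitioning listed above can be modelled with a small Python sketch. It only imitates the idea that each Hive partition maps to a `column=value` directory. The table rows, column names and both functions are hypothetical, and no Hive API is involved.

```python
from collections import defaultdict


def static_partition_insert(rows, partition_spec):
    """Static partitioning: the caller names the target partition up front,
    so every row lands in that one partition."""
    path = "/".join(f"{col}={val}" for col, val in partition_spec.items())
    return {path: list(rows)}


def dynamic_partition_insert(rows, partition_cols):
    """Dynamic partitioning: the partition is derived from each row's own
    values for the partition columns."""
    partitions = defaultdict(list)
    for row in rows:
        path = "/".join(f"{col}={row[col]}" for col in partition_cols)
        # Partition columns are encoded in the path, not stored in the data.
        data = {k: v for k, v in row.items() if k not in partition_cols}
        partitions[path].append(data)
    return dict(partitions)


rows = [
    {"user": "ann", "country": "US", "clicks": 3},
    {"user": "raj", "country": "IN", "clicks": 5},
    {"user": "bob", "country": "US", "clicks": 1},
]

# Static: the rows are already known to belong to country=US.
us_only = [{"user": "ann", "clicks": 3}, {"user": "bob", "clicks": 1}]
static_result = static_partition_insert(us_only, {"country": "US"})

# Dynamic: the country column of each row decides the partition.
dynamic_result = dynamic_partition_insert(rows, ["country"])
```

In this model the static insert needs one call per partition, while the dynamic insert fans a mixed batch of rows out to as many partitions as there are distinct values.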

About the Trainer

The trainer has 10+ years of IT experience and has trained more than 1,000 professionals online. His real-time IT experience enables him to provide dedicated, quality training. Teaching is his passion, and he has extensive experience delivering online training around the globe, with good communication skills.