On-the-Job Training: BigData for Students and Software Developers.

Get experience with some of the most common and challenging scenarios Hadoop administrators see in the real world, and become familiar with the most up-to-date details of the platform.

Lab Hours


160 hours

80 hours

Course Description

What is Big Data? Gartner defines Big Data as high-volume, high-velocity, and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. According to IBM, 80% of the data captured today is unstructured: it comes from sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few. All of this unstructured data is also Big Data.

What you'll learn: In this course we will go through BigData use case examples and learn how to manage Big Data using Hadoop and related technologies. We will learn how to install and configure a single-node Hadoop cluster and perform Hadoop Distributed File System (HDFS) and Hadoop MapReduce operations.

Hadoop Essentials teaches the fundamentals of setting up a Hadoop cluster, as well as the ecosystem of related technologies such as Hive, Pig, and Oozie. In addition, you will learn how to write MapReduce programs in Java, and how to use Apache Spark as an alternative to traditional MapReduce processing.

This BigData training is suitable for software developers, architects, IT services staff, deployment engineers, IT support staff, and development managers.


Course Prerequisites


Course Includes

Introduction To BigData
  • BigData Overview
  • BigData Use Cases
      • IT Infrastructure Optimization
      • Advertising Analysis
      • Predictive Analysis
      • Customer Churn Analysis
      • Aadhaar Project by the Govt of India
      • Weather Forecasting
      • Healthcare Analysis
      • Natural Resources Exploration
Hadoop Essentials
  • Introduction to Hadoop
  • Basic Architecture of Hadoop
  • Introduction to HDFS
  • Hadoop MapReduce Overview
  • Hadoop Daemons
  • How Hadoop Works and Its Characteristics
  • Hadoop Flavors - Introduction to Apache, Cloudera, Hortonworks, MapR, IBM, Pivotal
  • Hadoop Eco-System - Introduction to HBase, Hive, YARN and Flume
Hadoop Distributed File System
  • Introduction to HDFS
  • HDFS Nodes
  • HDFS Master and Slave
  • Daemons
  • Data Storage Mechanism
  • Basic and Advanced HDFS Architecture
  • Features and Characteristics
Hadoop MapReduce
  • Introduction to MapReduce
  • MapReduce Concepts
  • MapReduce Terminologies
  • Daemons Insights
  • MapReduce Flow
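The MapReduce flow covered above (map, shuffle/sort, reduce) can be sketched in a few lines of plain Python. This is a minimal word-count simulation of what Hadoop does at scale across a cluster, not the actual Hadoop Java API:

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every input record
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    # Shuffle/sort: group all values by key, as Hadoop does between map and reduce
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate the values for each key (here, sum the counts)
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data is big", "hadoop handles big data"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts)  # {'big': 3, 'data': 2, 'is': 1, 'hadoop': 1, 'handles': 1}
```

In real Hadoop the same three phases run in parallel across many nodes, with the shuffle moving data over the network between mappers and reducers.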
Spark Essentials
  • Describe how Apache Spark and Hadoop fit together
  • List three motivations for using Spark
  • Describe and understand RDDs
  • Implement an application using the key Spark concepts
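To make the RDD ideas above concrete, here is a toy Python sketch of two key RDD properties: transformations are lazy, and a lineage of transformations is only evaluated when an action runs. The `MiniRDD` class is an illustrative stand-in, not the real Spark API (in actual code you would use `pyspark` and `SparkContext.parallelize`):

```python
class MiniRDD:
    """Toy RDD: records transformations lazily, evaluates them on an action."""
    def __init__(self, data, lineage=None):
        self._data = data
        self._lineage = lineage or []   # functions to replay over the base data

    def map(self, fn):
        # Transformation: nothing is computed yet; we just extend the lineage
        return MiniRDD(self._data, self._lineage + [lambda xs: [fn(x) for x in xs]])

    def filter(self, pred):
        # Transformation: also lazy
        return MiniRDD(self._data, self._lineage + [lambda xs: [x for x in xs if pred(x)]])

    def collect(self):
        # Action: replay the whole lineage over the base data
        xs = list(self._data)
        for step in self._lineage:
            xs = step(xs)
        return xs

rdd = MiniRDD(range(1, 6)).map(lambda x: x * x).filter(lambda x: x % 2 == 1)
print(rdd.collect())  # [1, 9, 25]
```

Because the lineage is recorded rather than executed eagerly, a real Spark cluster can recompute lost partitions from it, which is the basis of RDD fault tolerance.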
Lab Sessions:
  • Deploy Hadoop on a single node cluster
  • Perform Hadoop Distributed File System (HDFS) and Hadoop MapReduce operations
  • Describe and understand RDDs
  • Execute MapReduce Code in Linux Using Eclipse IDE
