
Introduction to Big Data

The purpose of this training is to understand the Big Data landscape and its environment, and to become familiar with the architectural components and programming models used for the scalable analysis of Big Data.



Training aims

  • Understand the concept of Big Data

  • Identify the ecosystem and understand the associated technologies

  • Anticipate its integration into the company's IT activities

  • Get started with the Hadoop framework

Program:


Big Data environment

  • Big Data

  • Perform massively parallel computations with MapReduce

  • Perform distributed, graph-based computations with Spark

  • Build a Big Data strategy

Big Data process

  • Data acquisition

  • Data mining

  • Data pre-processing

  • Data analysis

  • Communicating the results

  • Turning knowledge into action

Getting started with Hadoop

  • The Hadoop ecosystem

  • The Hadoop Distributed File System

  • YARN: a resource manager for Hadoop

  • MapReduce: simple programming for great results

  • Cloud Computing: An Important Big Data Enabler

  • Cloud Service Models: An Exploration of Choices

  • Value of Hadoop and pre-built Hadoop images

  • Copy your data to the Hadoop Distributed File System (HDFS)

  • Run the WordCount program
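
To give a feel for the MapReduce model behind the WordCount exercise, here is a minimal sketch in plain Python. It does not use Hadoop or any Hadoop API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen for illustration, and everything runs in a single process, whereas Hadoop would distribute the map and reduce steps across a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key (the word)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum the values collected for one word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain map, shuffle, and reduce locally (illustration only)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["big data", "Big Hadoop"]
    print(word_count(sample))
```

In a real Hadoop job the input lines would be read from HDFS, and the framework itself would handle the shuffle between the map and reduce tasks; this sketch only mirrors the logical flow of the three steps.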


This training is for those new to data science who want to understand and become familiar with the terminology and basic concepts behind big data problems, applications and systems.
