
Description

The Apache Hadoop platform makes it easier to create distributed applications. This course will help you understand its architecture and give you the knowledge needed to install, configure and administer a Hadoop cluster. You will also learn how to optimize it and maintain it over time.

Who is this training for?

Hadoop cluster administrators, developers.

Prerequisites

Training objectives

  • Discover the concepts and issues related to Hadoop
  • Understand how the platform and its components work
  • Install the platform and manage it
  • Optimize the platform

Training program

      • Big Data challenges and contributions of the Hadoop framework.
      • Presentation of the Hadoop architecture.
      • Description of the main components of the Hadoop platform.
      • Presentation of the main market distributions and complementary tools (Cloudera, MapR, Dataiku...).
      • Advantages/disadvantages of the platform.
      • Hadoop Distributed File System (HDFS) working principles.
      • MapReduce working principles (see the word-count sketch at the end of this program).
      • Design of a "typical" cluster.
      • Hardware selection criteria.
      • Practical work: configuration of the Hadoop cluster.
      • Deployment type.
      • Installation of Hadoop.
      • Installation of other components (Hive, Pig, HBase, Flume...).
      • Practical work: installation of a Hadoop platform and its main components.
      • Management of Hadoop cluster nodes.
      • TaskTracker, JobTracker for MapReduce.
      • Management of tasks via schedulers.
      • Management of logs.
      • Using a cluster manager.
      • Practical work: listing jobs, queue status, job status, task management, access to the web UI.
      • Import of external data (files, relational databases) into HDFS.
      • Handling of HDFS files (see the HDFS sketch at the end of this program).
      • Practical work: importing external data with Flume, querying relational databases with Sqoop.
      • Authorization and security management.
      • Recovery from NameNode failure (MRv1).
      • NameNode high availability (MRv2/YARN).
      • Practical work: configuring service-level authorization (SLA) and an access control list (ACL).
      • Monitoring (Ambari, Ganglia...).
      • Benchmarking/profiling of a cluster.
      • The Apache GridMix and Vaidya tools.
      • Choosing the block size.
      • Other tuning options: use of compression, memory configuration, etc. (see the tuning sketch at the end of this program).
      • Practical work: getting familiar with the cluster monitoring and optimization commands.
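
To make the MapReduce working principles listed in the program more concrete, here is a minimal word-count job written against the standard org.apache.hadoop.mapreduce API. It is a sketch for illustration only, not part of the course material; the class name and the input/output paths passed on the command line are placeholders.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Minimal MapReduce job: counts word occurrences in text files stored in HDFS.
    public class WordCount {

        // Map phase: emit (word, 1) for every token of the input split.
        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer tokens = new StringTokenizer(value.toString());
                while (tokens.hasMoreTokens()) {
                    word.set(tokens.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Reduce phase: sum the counts received for each word after the shuffle.
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            @Override
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable value : values) {
                    sum += value.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);  // local pre-aggregation before the shuffle
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory
            FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory, must not exist yet
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }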
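
The data-import and HDFS file-handling items are covered in the course with tools such as Flume, Sqoop and the hdfs dfs command line; as a complement, the sketch below uses the Java FileSystem API to copy a local file into HDFS and list the target directory. The NameNode address and the file paths are assumptions chosen for illustration.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Copies a local file into HDFS and lists the target directory.
    public class HdfsImportExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Assumed NameNode address; on a real cluster this comes from core-site.xml.
            conf.set("fs.defaultFS", "hdfs://namenode:8020");

            try (FileSystem fs = FileSystem.get(conf)) {
                // Equivalent to: hdfs dfs -put (local source, HDFS destination; paths are illustrative).
                fs.copyFromLocalFile(new Path("/tmp/sales.csv"), new Path("/data/raw/sales.csv"));

                // Equivalent to: hdfs dfs -ls /data/raw
                for (FileStatus status : fs.listStatus(new Path("/data/raw"))) {
                    System.out.printf("%s\t%d bytes\treplication=%d%n",
                            status.getPath(), status.getLen(), status.getReplication());
                }
            }
        }
    }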
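
For the block-size, compression and memory tuning items, the following sketch shows, with assumed values, how such settings can be passed through the Hadoop Configuration API when submitting a job; on a real cluster the same keys are normally set cluster-wide in hdfs-site.xml and mapred-site.xml.

    import org.apache.hadoop.conf.Configuration;

    // Passes a few common tuning settings through the Configuration API (values are illustrative).
    public class TuningExample {
        public static void main(String[] args) {
            Configuration conf = new Configuration();

            // Block size for newly written files: 256 MB instead of the default.
            conf.setLong("dfs.blocksize", 256L * 1024 * 1024);

            // Compress intermediate map output to reduce shuffle traffic.
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.set("mapreduce.map.output.compress.codec",
                    "org.apache.hadoop.io.compress.SnappyCodec");

            // Memory requested for map and reduce containers, in MB (YARN-era property names).
            conf.setInt("mapreduce.map.memory.mb", 2048);
            conf.setInt("mapreduce.reduce.memory.mb", 4096);

            // A Job built from this Configuration (as in the word-count sketch) picks these settings up.
            System.out.println("dfs.blocksize = " + conf.get("dfs.blocksize"));
        }
    }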
Duration: 28 h

