Description

The continued growth of digital data within businesses and public organizations has given rise to the concept of “Big Data”. The term refers to the management and preservation of vast amounts of data, and to the potential value they represent. This seminar addresses the specific challenges of Big Data and the technical solutions available for managing and processing these masses of data. Because of the sheer quantity of data to be processed, these solutions break with traditional analysis methods.

Who is this training for?

IS directors, IS managers, project managers, architects, consultants, or anyone required to take part in a Big Data project.

Prerequisites

Basic knowledge of technical architectures.

Training objectives

  • Discover the main concepts of Big Data
  • Identify the economic issues
  • Evaluate the advantages and disadvantages of Big Data
  • Understand the main problems and potential solutions
  • Identify the main methods and fields of application of Big Data
  • Understand the advantages and constraints of Big Data

Training program

      • The origins of Big Data: a world of digital data, e-Health, chronology.
      • A definition by the four Vs: the provenance of data.
      • A rupture: changes in quantity, quality, habits.
      • The value of data: a change in importance.
      • Data as a raw material.
      • The fourth paradigm of scientific discovery.
      • The sequence of operations.
      • Acquisition.
      • Data collection: crawling, scraping.
      • Stream management: event processing (Complex Event Processing, CEP).
      • Indexing the incoming flow.
      • Integration with old data.
      • Data quality: a fifth V? The different types of processing: search, learning (machine learning), transactional, data mining.
      • Other sequencing models: Amazon, e-Health.
      • One or more data repositories? From Hadoop to in-memory.
      • From sentiment analysis to knowledge discovery.
      • The architectural model of public and private Clouds.
      • XaaS services.
      • The objectives and advantages of Cloud architectures.
      • Infrastructure.
      • The similarities and differences between Cloud and Big Data.
      • Storage clouds.
      • Classification, security and confidentiality of data.
      • Structure as a classification criterion: unstructured, structured, semi-structured.
      • Classification according to life cycle: temporary or permanent data, active archives.
      • Difficulties in terms of security: increase in volumes, distribution.
      • Potential solutions.
      • The philosophy of open data and the objectives.
      • The opening up of public data.
      • The difficulties of implementation.
      • The essential characteristics of open data.
      • Areas of application.
      • The expected benefits.
      • Servers, disks, and network: the use of SSDs and the importance of network infrastructure.
      • Cloud architectures and more traditional architectures.
      • The advantages and difficulties.
      • The TCO.
      • Power consumption: servers (IPNM), disks (MAID).
      • Object storage: principle and advantages.
      • Object storage compared to traditional NAS and SAN storage.
      • Software architecture.
      • Implementation levels of data management storage.
      • Software-Defined Storage.
      • Centralized architecture (Hadoop File System).
      • Peer-to-peer architecture and mixed architecture.
      • Interfaces and connectors: S3, CDMI, FUSE, etc.
      • Future of other storage (NAS, SAN) compared to object storage.
      • Preservation over time in the face of increases in volume.
      • Backup, online or local? The traditional archive and the active archive.
      • Links with storage hierarchy management: future of magnetic tapes.
      • Multi-site replication.
      • The degradation of storage media.
      • Classification of analysis methods according to data volume and processing power.
      • Hadoop: the MapReduce processing model.
      • The Hadoop ecosystem: Hive, Pig.
      • The difficulties of Hadoop.
      • OpenStack and the Ceph data manager.
      • Complex Event Processing: an example?
      • From BI to Big Data.
      • Renewing decision support and transactional systems: NoSQL databases.
      • Typology and examples.
      • Data ingestion and indexing.
      • Two examples: Splunk and Logstash.
      • Open source crawlers.
      • Search and analysis: Elasticsearch.
      • Learning: Mahout.
      • In-memory.
      • Visualization: real time or not, in the Cloud (Bime); comparison of QlikView, TIBCO Spotfire, and Tableau.
      • A general architecture of data mining via Big Data.
      • Anticipation: user needs in businesses, equipment maintenance.
      • Security: people, fraud detection (postal, taxes), the network.
      • Recommendation.
      • Marketing analyses and impact analyses.
      • Path (user journey) analyses.
      • Video content distribution.
      • Big Data for the automotive industry? For the oil industry?
      • Should we embark on a Big Data project?
      • What future for data?
      • Governance of data storage: role and recommendations, the data scientist, the skills needed for a Big Data project.
    • 832
    • 14 h
