Description
This course will help you understand the challenges and benefits of Big Data, as well as the technologies used to implement it. You will learn to integrate massive volumes of structured and unstructured data using ETL tools, then analyze them with statistical models and dynamic dashboards.
Who is this training for?
Data miners, statistical analysts, developers, project managers, business intelligence consultants.
Prerequisites
Training objectives
Training program
- Understand the concepts and issues of Big Data
- Origins and definition of Big Data: BI faced with the growth and diversity of data.
- Key market figures in the world and in France.
- The challenges of Big Data: ROI, organization, data confidentiality.
- An example of Big Data architecture.
- Big Data technologies
- What to remember.
- Summary of best practices.
- Bibliography.
- Manage structured and unstructured data
- Hadoop Distributed File System (HDFS) working principles.
- Import external data to HDFS.
- Perform SQL queries with Hive.
- Use Pig to process data.
- Use an ETL tool to industrialize the creation of massive data flows.
- Presentation of Talend For Big Data.
- Exercise: Implementing big data flows.
- Data analysis methods for Big Data
- Exploration methods.
- Segmentation and classification.
- Estimation and prediction.
- Model implementation.
- Exercise: Setting up analyses with R.
- Data visualization and concrete use cases
- Reporting tools available on the market.
- Methodology for formatting reports.
- Contribution of Big Data to "Social Business".
- Measure e-reputation and brand awareness.
- Measure customer experience and satisfaction, optimize the customer journey.
- Exercise: Installation and use of a data visualization tool to create dynamic analyses, retrieval of data from social networks, and creation of an e-reputation analysis.
- Conclusion
- What to remember.
- Summary of best practices.
- Bibliography.