What you need to know before you start

Starts 14 July 2026 15:33

Ends 14 July 2026

Data Engineering on AWS - A Streaming Data Pipeline Solution

In this course, you will learn to build streaming data analytics solutions using AWS services, including Amazon Kinesis, Amazon Data Firehose, and Amazon Managed Streaming for Apache Kafka (Amazon MSK). Kinesis is a streaming service.

via AWS Skill Builder

Not specified

Optional upgrade available

All levels

Self-paced

Free

Overview

In this course, you will learn to build streaming data analytics solutions using AWS services, including Amazon Kinesis, Amazon Data Firehose, and Amazon Managed Streaming for Apache Kafka (Amazon MSK). Kinesis is a massively scalable and durable real-time data streaming service.

Amazon MSK offers a secure, fully managed, and highly available Apache Kafka service.

You will learn how Kinesis and Amazon MSK integrate with AWS services such as AWS Glue and AWS Lambda. The course addresses the streaming data ingestion, stream storage, and stream processing components of the data analytics pipeline.
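As a rough illustration of the Kinesis and AWS Lambda integration mentioned above, the sketch below shows a Lambda-style handler that decodes a batch of Kinesis records. When Lambda invokes a function from a Kinesis stream, each record's payload arrives base64-encoded under `Records[i].kinesis.data`. The assumption that producers write UTF-8 JSON, and the `make_event` helper used to build a sample batch, are illustrative only and are not taken from the course material.

```python
import base64
import json


def handler(event, context):
    """Decode the Kinesis records delivered to a Lambda function in one batch."""
    decoded = []
    for record in event["Records"]:
        # Lambda delivers each Kinesis record payload base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumption: producers write UTF-8 JSON objects.
        decoded.append(json.loads(payload))
    return {"processed": len(decoded), "items": decoded}


def make_event(items):
    """Build a minimal sample event shaped like a Kinesis batch (hypothetical helper)."""
    return {
        "Records": [
            {
                "kinesis": {
                    "partitionKey": str(i),
                    "data": base64.b64encode(
                        json.dumps(item).encode("utf-8")
                    ).decode("ascii"),
                }
            }
            for i, item in enumerate(items)
        ]
    }
```

Calling `handler(make_event([{"page": "/home"}]), None)` locally exercises the decoding path without any AWS resources. A real event carries additional fields, such as sequence numbers and the event source ARN, that this sketch ignores.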

You will also learn to apply security, performance, and cost management best practices to the operation of Kinesis and Amazon MSK.

The course is divided into learning modules and lab modules. The learning modules introduce new concepts and the AWS services you can use to build your solution.

Lab modules are in-depth, hands-on activities with step-by-step instructions for you to apply what you’ve learned.

Activities:

Interactive content, videos, knowledge checks, assessments, and hands-on labs

Course objectives:

Recognize a customer analytics challenge and describe an appropriate AWS solution for it that is built on a streaming data architecture.
Describe data sources suitable for streaming applications and how that data is ingested.
Identify short-term and long-term storage services for streaming data.
Describe how to design and implement real-time data processing solutions.
Recognize how to serve streaming data for consumption by end users.
Describe how to optimize a streaming data pipeline using Amazon Kinesis, Amazon MSK, and Amazon Redshift.
Identify best practices for securing a streaming data pipeline.

Intended audience:

Data engineer
Data analyst
Data architect
Business intelligence engineer

Recommended skills:

2–3 years of experience in data engineering
1–2 years of hands-on experience with AWS services
Completed AWS Cloud Practitioner Essentials or equivalent
Completed Fundamentals of Analytics on AWS Part 1 and 2
Completed Data Engineering on AWS – Foundations

Course outline:

Module 1: Building a Streaming Data Pipeline Solution

This course shows how to identify, select, and configure the appropriate AWS services for building a streaming data pipeline solution to meet a fictitious customer's business goals.

Introduction
Ingesting Data from Stream Sources
Storing Streaming Data
Processing Data
Analyzing Data
Final Assessment
Conclusion

Module 2: Streaming Analytics with Amazon Managed Service for Apache Flink (Lab)

This lab is a step-by-step, hands-on activity to build a stream processing pipeline by ingesting clickstream data and enriching the clickstream data with catalog data stored in Amazon Simple Storage Service (Amazon S3). You perform analysis on the enriched data to identify the sales per category in real time and visualize the output.

Lab overview
Task 1: Setting up Zeppelin notebook environment
Task 2: Connect to the Amazon EC2 producer and start the clickstream generator
Task 3: Import the Zeppelin notebook
Task 4: Analytics development in Managed Apache Flink Studio with Zeppelin notebook
Task 5: Understanding in-memory table creation in AWS Glue Data Catalog
Conclusion
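
The lab's core logic, enriching clickstream events with catalog data and totaling sales per category, can be sketched in plain Python. This is not the lab's Flink or Zeppelin code. The field names (`ts`, `product_id`, `action`), the in-memory `CATALOG`, and the 60-second tumbling window are all assumptions chosen for illustration. In the lab the catalog is stored in Amazon S3 and the processing runs in Amazon Managed Service for Apache Flink.

```python
from collections import defaultdict

# Hypothetical catalog data; in the lab this is stored in Amazon S3.
CATALOG = {
    "p1": {"category": "books", "price": 12.0},
    "p2": {"category": "games", "price": 40.0},
}


def enrich(event, catalog):
    """Attach category and price to a click event; return None if the product is unknown."""
    product = catalog.get(event["product_id"])
    if product is None:
        return None
    return {**event, **product}


def sales_per_category(events, catalog, window_seconds=60):
    """Sum purchase revenue per (window start, category) over tumbling windows."""
    totals = defaultdict(float)
    for event in events:
        # Only purchases count as sales; views and other actions are skipped.
        if event["action"] != "purchase":
            continue
        enriched = enrich(event, catalog)
        if enriched is None:
            continue
        # Align the event timestamp (in seconds) to the start of its window.
        window_start = event["ts"] - event["ts"] % window_seconds
        totals[(window_start, enriched["category"])] += enriched["price"]
    return dict(totals)
```

A stream processor such as Flink performs the same join and windowed aggregation continuously over unbounded input, while this sketch processes a finite list in one pass.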

Module 3: Optimizing and Securing a Streaming Data Pipeline Solution

This course covers how to configure a fictitious customer's streaming data pipeline solution to increase efficiency, control costs, secure and protect the data, and govern the infrastructure.

Optimization
Security and Governance
Final Assessment
