What you need to know before you start

Start: 14 July 2026 10:39

End: 14 July 2026

Data Engineering on AWS - A Streaming Data Pipeline Solution (Amazonian)

via AWS Skill Builder

Not specified

Optional upgrade available

All levels

Self-paced

Free

Summary

In this course, you will learn to build streaming data analytics solutions using AWS services, including Amazon Kinesis, Amazon Data Firehose, and Amazon Managed Streaming for Apache Kafka (Amazon MSK). Kinesis is a massively scalable and durable real-time data streaming service.

Amazon MSK offers a secure, fully managed, and highly available Apache Kafka service.

You will learn how Kinesis and Amazon MSK integrate with AWS services such as AWS Glue and AWS Lambda. The course addresses the streaming data ingestion, stream storage, and stream processing components of the data analytics pipeline.
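As a rough illustration of the Kinesis and Lambda integration mentioned above, the sketch below shows a Lambda-style handler that decodes records from a Kinesis event. It is not taken from the course. It assumes producers write UTF-8 JSON objects, and `make_event` is a hypothetical helper that only builds a local test event in the shape Lambda uses for Kinesis (base64 payload under `Records[].kinesis.data`).

```python
import base64
import json


def handler(event, context):
    """Decode the JSON payloads of Kinesis records passed to a Lambda function."""
    decoded = []
    for record in event.get("Records", []):
        # Lambda delivers each Kinesis payload base64-encoded under kinesis.data.
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumption: every payload is a UTF-8 encoded JSON object.
        decoded.append(json.loads(payload))
    return decoded


def make_event(items):
    """Hypothetical helper: build a minimal Kinesis-shaped event for local testing."""
    return {
        "Records": [
            {"kinesis": {"data": base64.b64encode(json.dumps(item).encode("utf-8")).decode("ascii")}}
            for item in items
        ]
    }
```

Calling `handler(make_event([{"item": "book"}]), None)` locally returns `[{'item': 'book'}]`. A real function would typically process or forward each record rather than return the decoded list.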

You will also learn to apply security, performance, and cost management best practices to the operation of Kinesis and Amazon MSK.

The course is divided into learning modules and lab modules. The learning modules introduce new concepts and the AWS services you can use to build your solution.

Lab modules are in-depth, hands-on activities with step-by-step instructions for you to apply what you’ve learned.

Activities

Interactive content, videos, knowledge checks, assessments, and hands-on labs

Course objectives

Recognize a customer analytics challenge and describe an appropriate AWS solution, built on a streaming data architecture, for solving it.
Describe data sources suitable for streaming applications and how that data is ingested.
Identify short-term and long-term storage services for streaming data.
Describe how to design and implement real-time data processing solutions.
Recognize how to serve streaming data for consumption by end users.
Describe how to optimize a streaming data pipeline using Amazon Kinesis, Amazon MSK, and Amazon Redshift.
Identify best practices for securing a streaming data pipeline.

Intended audience

Data engineer
Data analyst
Data architect
Business intelligence engineer

Recommended skills

2–3 years of experience in data engineering
1–2 years of hands-on experience with AWS services
Completed AWS Cloud Practitioner Essentials or equivalent
Completed Fundamentals of Analytics on AWS Part 1 and 2
Completed Data Engineering on AWS – Foundations

Course outline

Module 1: Building a Streaming Data Pipeline Solution

This course shows how to identify, select, and configure the appropriate AWS services for building a streaming data pipeline solution to meet a fictitious customer's business goals.

Introduction
Ingesting Data from Stream Sources
Storing Streaming Data
Processing Data
Analyzing Data
Final Assessment
Conclusion

Module 2: Streaming Analytics with Amazon Managed Service for Apache Flink (Lab)

This lab is a step-by-step, hands-on activity to build a stream processing pipeline by ingesting clickstream data and enriching the clickstream data with catalog data stored in Amazon Simple Storage Service (Amazon S3). You perform analysis on the enriched data to identify the sales per category in real time and visualize the output.
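The enrichment and aggregation described above can be sketched in plain Python. This is only an illustration of the idea, not the lab's Flink or Zeppelin code. The catalog contents and the field names (`product_id`, `event`, `quantity`, `category`, `price`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical catalog, standing in for the catalog data stored in Amazon S3.
CATALOG = {
    "p1": {"category": "books", "price": 12.0},
    "p2": {"category": "games", "price": 30.0},
}


def enrich(click, catalog):
    """Attach category and price to a click event; return None for unknown products."""
    item = catalog.get(click["product_id"])
    if item is None:
        return None
    return {**click, **item}


def sales_per_category(clicks, catalog):
    """Sum price * quantity per category over purchase events only."""
    totals = defaultdict(float)
    for click in clicks:
        enriched = enrich(click, catalog)
        if enriched is not None and enriched.get("event") == "purchase":
            totals[enriched["category"]] += enriched["price"] * enriched.get("quantity", 1)
    return dict(totals)
```

In a streaming engine the same join and grouping would run continuously over the incoming events, so the totals update as new clicks arrive instead of being computed once over a finished list.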

Lab overview
Task 1: Setting up Zeppelin notebook environment
Task 2: Connect to the Amazon EC2 producer and start the clickstream generator
Task 3: Import the Zeppelin notebook
Task 4: Analytics development in Managed Apache Flink Studio with Zeppelin notebook
Task 5: Understanding in-memory table creation in AWS Glue Data Catalog
Conclusion

Module 3: Optimizing and Securing a Streaming Data Pipeline Solution

This course covers how to configure a fictitious customer's streaming data pipeline solution to increase efficiency, control costs, secure and protect the data, and govern the infrastructure.
