Overview
Join our comprehensive course on Building ETL and Data Pipelines with Bash, Airflow, and Kafka, offered by edX. This course is designed to equip you with essential knowledge and practical skills in data engineering and warehousing, focusing on well-designed and automated data pipelines and ETL processes, which are crucial for a thriving Business Intelligence platform.
Discover how to define efficient data workflows, pipelines, and processes from the outset. Learn to ensure that the right raw data is collected, transformed, and loaded into the desired storage layers, making it readily available for analysis. This skill is essential in early-stage platform design, where it underpins robust data handling and a sound business intelligence strategy.
By the end of this course, you will have a firm grasp of both Extract, Transform, Load (ETL) and Extract, Load, and Transform (ELT) processes. You'll gain practical experience in extracting, transforming, and loading data into a staging area. Enhance your skills by creating an ETL data pipeline using Bash shell scripting, constructing a batch ETL workflow using Apache Airflow, and developing a streaming data pipeline using Apache Kafka.
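To give a flavor of the kind of batch workflow the course covers, here is a minimal sketch of an Airflow DAG that chains Bash-based extract, transform, and load steps. This is an illustration, not the course's actual lab code: the DAG id, file paths, and shell commands are assumptions, and it presumes Apache Airflow 2.x is installed.

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="sample_batch_etl",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",         # run the batch once per day
    catchup=False,
) as dag:
    # Extract: pull selected columns from a raw CSV (paths are assumptions)
    extract = BashOperator(
        task_id="extract",
        bash_command="cut -d',' -f1,4 /tmp/raw_data.csv > /tmp/extracted.csv",
    )

    # Transform: normalize the delimiter before staging
    transform = BashOperator(
        task_id="transform",
        bash_command="tr ',' ':' < /tmp/extracted.csv > /tmp/transformed.txt",
    )

    # Load: copy the transformed file into a staging directory
    load = BashOperator(
        task_id="load",
        bash_command="mkdir -p /tmp/staging && cp /tmp/transformed.txt /tmp/staging/",
    )

    # Declare the linear extract -> transform -> load dependency chain
    extract >> transform >> load

The appeal of this pattern, and a recurring theme in the course, is that each stage is a small, independently testable task, while Airflow handles scheduling, retries, and dependency ordering.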
Through practice labs and a real-world-inspired project, you'll build various data pipelines using these technologies, significantly strengthening your portfolio and demonstrating your capability as a Data Engineer. This course is recommended for those with prior experience working with datasets, SQL, relational databases, and Bash shell scripts.
Categorized under Big Data Courses, Apache Airflow Courses, and Apache Kafka Courses, this course aims to develop proficient Data Engineers and Data Warehousing specialists ready to handle complex data environments.
Taught by
Rav Ahuja, Yan Luo and Jeff Grossman