Overview
In the Data Engineering Capstone Project, participants demonstrate the skills expected of a professional data engineer. The core challenge is to design, implement, and manage an end-to-end data analytics platform. This includes building and managing relational and non-relational databases, data warehouses, data pipelines, big data processing engines, and Business Intelligence (BI) tools.
The Capstone requires participants to apply and refine the skills and knowledge gained throughout the IBM Data Engineering Professional Certificate series. Participants design databases, collect data from multiple sources, run extract, transform, and load (ETL) operations into a data warehouse, and deploy a cloud-based BI tool to generate analytical reports and visualizations. The project also covers predictive analytics and building machine learning models with big data tools and techniques.
Much of the Capstone consists of hands-on labs. Participants demonstrate proficiency with a range of tools and technologies, including Python, Bash scripts, SQL, NoSQL, relational database management systems (RDBMSes), ETL processes, MySQL, PostgreSQL, Db2, MongoDB, Apache Airflow, Apache Spark, and Cognos Analytics.
On completing the Capstone, participants will have a portfolio that demonstrates their ability to carry out real-world data engineering tasks and prepares them for entry-level data engineering roles. This project is offered by edX under the categories Python Courses, Big Data Courses, Business Intelligence Courses, Data Warehousing Courses, and Databases Courses.
Taught by
Rav Ahuja and Ramesh Sannareddy