What You Need to Know Before You Start
Starts 5 June 2025 19:35
Ends 5 June 2025
7 hours 47 minutes
Optional upgrade available
Not Specified
Progress at your own speed
Paid Course
Overview
Are you ready to unlock the full potential of your data analytics pipelines? dbt on Databricks is a comprehensive course tailored for data professionals aiming to master data transformation using dbt (data build tool) on the Databricks platform, harnessing the power of Apache Spark for scalable and efficient workflows.
Syllabus
- Introduction to dbt and Databricks
- Setting Up Your Environment
- Fundamentals of dbt
- Advanced dbt Techniques
- Leveraging Apache Spark with dbt
- Implementing dbt in Databricks
- Data Quality and Testing
- Debugging and Optimization
- Use Cases and Real-world Applications
- Course Project
- Conclusion and Next Steps
Overview of dbt and its role in data transformation
Introduction to Databricks and Apache Spark
Installing dbt on Databricks
Configuring dbt profiles for Databricks
Understanding the dbt workflow
Writing basic dbt models
Utilizing macros and variables
Implementing tests and documentation
Strategies for model optimization
Using hooks and operations
Overview of Apache Spark architecture
Integrating Spark SQL with dbt models
Managing large datasets with Spark
Running dbt jobs in Databricks notebooks
Scheduling and orchestrating dbt runs in Databricks
Best practices for data testing in dbt
Automating tests on Databricks
Identifying and resolving performance bottlenecks
Profiling and optimizing queries with dbt and Spark
Case studies of dbt on Databricks implementations
Success stories from industry
Designing and implementing a data transformation pipeline using dbt on Databricks
Presentation and peer review of projects
Recap of key concepts
Resources for further learning and development
Taught by
Malvik Vaghadia
Subjects
Business