What You Need to Know Before You Start
Starts 7 June 2025 18:28
Ends 7 June 2025
26 minutes
Optional upgrade available
Not Specified
Progress at your own speed
Free Video
Overview
Discover how data lakehouses combine performance and affordability, exploring Apache Arrow, Iceberg, and Project Nessie as alternatives to traditional data warehouses.
Syllabus
- Introduction to Data Lakehouses
  - Definition and key characteristics
  - Comparison with data warehouses and data lakes
  - Benefits and limitations of data lakehouses
- Core Components of Data Lakehouses
  - Storage and compute separation
  - Metadata management
  - Query engines and optimization
- Apache Arrow
  - Overview of Apache Arrow
  - In-memory columnar format
  - Performance benefits for data lakehouses
  - Integration with other data technologies
- Apache Iceberg
  - Introduction to Apache Iceberg
  - Architecture and features
  - Advantages over traditional table formats
  - Use cases and implementation examples
- Project Nessie
  - Overview of Project Nessie
  - Version control for data lakehouses
  - Branching, merging, and reproducibility
  - Ecosystem and integration
- Comparing Open Source Data Lakehouse Technologies
  - Use cases and performance comparisons
  - Cost and affordability analysis
  - Case studies of successful implementations
- Practical Considerations and Best Practices
  - Data governance and security
  - Performance optimization strategies
  - Choosing the right components for specific needs
- Future Trends and Developments in Data Lakehouses
  - Emerging technologies and innovations
  - Industry adoption and evolution
  - Speculations on future directions in data management
- Course Review and Final Thoughts
  - Recap of key concepts and technologies
  - Discussion on the impact of data lakehouses in the industry
  - Q&A and interactive discussions
Subjects
Business