What You Need to Know Before You Start

Starts 7 June 2025 18:28

Ends 7 June 2025

Open Source and the Data Lakehouse - Understanding Components and Technologies

OSACon via YouTube


26 minutes

Optional upgrade available

Not Specified

Progress at your own speed

Free Video

Overview

Discover how data lakehouses combine performance and affordability, exploring Apache Arrow, Iceberg, and Project Nessie as alternatives to traditional data warehouses.

Syllabus

  • Introduction to Data Lakehouses
    • Definition and key characteristics
    • Comparison with data warehouses and data lakes
    • Benefits and limitations of data lakehouses
  • Core Components of Data Lakehouses
    • Storage and compute separation
    • Metadata management
    • Query engines and optimization
  • Apache Arrow
    • Overview of Apache Arrow
    • In-memory columnar format
    • Performance benefits for data lakehouses
    • Integration with other data technologies
  • Apache Iceberg
    • Introduction to Apache Iceberg
    • Architecture and features
    • Advantages over traditional table formats
    • Use cases and implementation examples
  • Project Nessie
    • Overview of Project Nessie
    • Version control for data lakehouses
    • Branching, merging, and reproducibility
    • Ecosystem and integration
  • Comparing Open Source Data Lakehouse Technologies
    • Use cases and performance comparisons
    • Cost and affordability analysis
    • Case studies of successful implementations
  • Practical Considerations and Best Practices
    • Data governance and security
    • Performance optimization strategies
    • Choosing the right components for specific needs
  • Future Trends and Developments in Data Lakehouses
    • Emerging technologies and innovations
    • Industry adoption and evolution
    • Speculations on future directions in data management
  • Course Review and Final Thoughts
    • Recap of key concepts and technologies
    • Discussion of the impact of data lakehouses on the industry
    • Q&A and interactive discussions

Subjects

Business