End-to-End Lakehouse Implementation using Delta Lake
Photo by Jacob Bentzinger on Unsplash
Data is one of the most valuable assets for businesses today, but managing and processing large volumes of data can be a complex and challenging task. Traditional data lakes and big data frameworks offer scalable storage and processing capabilities, but they often lack critical... Continue Reading →
Spark ETL Chapter 11 with Lakehouse (Delta table Optimization)
Previous blog/Context: In an earlier blog post, we discussed Spark ETL with Lakehouse, covering the well-known Lakehouse formats. See the blog post below for more details. https://developershome.blog/2023/04/04/spark-etl-chapter-10-with-lakehouse-delta-lake-vs-apache-iceberg-vs-apache-hudi/ Introduction: Today, we will discuss the points below. Load data into a Delta table and check performance by executing queries. Load data into a Delta table with partitioning & check... Continue Reading →
Spark ETL Chapter 10 with Lakehouse (Delta Lake vs Apache Iceberg vs Apache HUDI)
Previous blog/Context: In an earlier blog post, we discussed Spark ETL with Lakehouse (with Apache Iceberg). See the blog post below for more details. https://developershome.blog/2023/03/21/spark-etl-chapter-9-with-lakehouse-apache-iceberg/ Introduction: Today, we will discuss the points below. Spark ETL with the well-known Lakehouse formats (Delta Lake, Apache Iceberg, and Apache Hudi). Offerings from all these Lakehouse data formats.... Continue Reading →