Previous blog/Context: In an earlier blog, we discussed Spark ETL with Lakehouse (all the popular lakehouse formats). See the blog post below for more details. https://developershome.blog/2023/04/05/spark-etl-chapter-11-with-lakehouse-delta-table-optimization/ Introduction: Today, we will discuss the following points: what Apache Kafka is; the basic concepts of Apache Kafka (publisher and subscriber); publishing and subscribing to messages from the command line... Continue Reading →
Spark ETL Chapter 8 with Lakehouse | Apache HUDI
Previous blog/Context: In an earlier blog, we discussed Spark ETL with Lakehouse (with Delta Lake). See the blog post below for more details. https://developershome.blog/2023/03/19/spark-etl-chapter-7-with-lakehouse-delta-lake/ Introduction: In this blog, we will discuss Spark ETL with Apache HUDI. We will first understand what Apache HUDI is and why it is used for creating a lakehouse. We... Continue Reading →
Spark ETL Chapter 7 with Lakehouse | Delta Lake
Previous blog/Context: In an earlier blog, we discussed Spark ETL with APIs. See the blog post below for more details. https://developershome.blog/2023/03/18/spark-etl-chapter-6-with-apis/ Introduction: In this blog, we will discuss Spark ETL with a lakehouse. We will first understand what a lakehouse is, why we need one, and what the formats are for storing... Continue Reading →