Spark Chapter 12 Spark Streaming with Apache Kafka

Previous blog/Context: In an earlier blog, we discussed Spark ETL with Lakehouse (all the well-known lakehouse formats). See the blog post below for more details.
https://developershome.blog/2023/04/05/spark-etl-chapter-11-with-lakehouse-delta-table-optimization/

Introduction: Today, we will discuss the points below.
What is Apache Kafka?
Basic concepts of Apache Kafka (Publisher and Subscriber)
Publish and subscribe messages from the command line...

Continue Reading →

Spark ETL Chapter 11 with Lakehouse (Delta table Optimization)

Previous blog/Context: In an earlier blog, we discussed Spark ETL with Lakehouse (all the well-known lakehouse formats). See the blog post below for more details.
https://developershome.blog/2023/04/04/spark-etl-chapter-10-with-lakehouse-delta-lake-vs-apache-iceberg-vs-apache-hudi/

Introduction: Today, we will discuss the points below.
Load data into a Delta table and check performance by executing queries.
Load data into a Delta table with partitioning & check...

Continue Reading →
