Spark Chapter 12 Spark Streaming with Apache Kafka

Previous blog/Context: In an earlier post, we discussed Spark ETL with Lakehouse (all the well-known lakehouse formats). See the post below for more details. https://developershome.blog/2023/04/05/spark-etl-chapter-11-with-lakehouse-delta-table-optimization/ Introduction: Today, we will discuss the following points. What is Apache Kafka? Basic concepts of Apache Kafka (Publisher and Subscriber). Publish and subscribe messages from the command line... Continue Reading →

Data Engineering Problem 8 (Top distance travelled by rider)

See the earlier posts for an overview of our Data Engineering learning plan and system setup. Today we solve one more Data Engineering problem and pick up new concepts along the way. For earlier problem solutions and key learning points, follow the link below. https://developershome.blog/category/data-engineering/problem-solving/ Problem Statement: Find the top 10 users that have traveled... Continue Reading →
