Previous blog/Context: In an earlier blog post, we discussed Spark ETL with files (CSV, JSON, Text, Parquet and ORC). See the post below for more details. https://developershome.blog/2023/03/02/spark-etl-chapter-0-with-files-csv-json-parquet-orc/ Introduction: In this blog post, we will discuss Spark ETL with SQL databases. We will consider MySQL and PostgreSQL for Spark ETL. All other SQL Databases like... Continue Reading →
Data Engineering Problem 8 (Top distance travelled by rider)
Please see our earlier blog posts for an overview of our Data Engineering learning plan and system setup. Today we are solving one more Data Engineering problem and learning new concepts along the way. For earlier problem solutions and key learning points, follow the link below. https://developershome.blog/category/data-engineering/problem-solving/ Problem Statement: Find the top 10 users that have traveled... Continue Reading →
Data Engineering Problem 7 (eBay Returning active users)
Please see our earlier blog posts for an overview of our Data Engineering learning plan and system setup. Today we are solving one more Data Engineering problem and learning new concepts along the way. For earlier problem solutions and key learning points, follow the link below. https://developershome.blog/category/data-engineering/problem-solving/ Problem Statement: Write a query that'll identify returning active users.... Continue Reading →