Previous blog/Context: In an earlier post, we discussed Spark ETL with a NoSQL database (MongoDB). See that post for details: https://developershome.blog/2023/03/07/spark-etl-chapter-2-with-nosql-database-mongodb-cassandra/ Introduction: In this post, we discuss Spark ETL with cloud data lakes, using Azure Blob Storage as the example. We will use public blob storage and... Continue Reading →
Spark ETL Chapter 2 with NoSQL Database (MongoDB | Cassandra)
Previous blog/Context: In an earlier post, we discussed Spark ETL with SQL databases (MySQL and PostgreSQL). See that post for details: https://developershome.blog/2023/03/06/spark-etl-with-sql-databases-mysql-postgresql/ Introduction: In this post, we discuss Spark ETL with NoSQL databases, using MongoDB for all of the ETL steps. All... Continue Reading →
Spark Installing external Packages
Introduction: In this post, we discuss how to install external packages in Spark in three ways: from a Jupyter notebook, from the terminal using PySpark, and from the terminal when submitting jobs (spark-submit). Pre-installed Packages: A Spark installation already comes with a few packages. It depends... Continue Reading →