Spark
This is an extension of my 2025 Learning Log.
I'm reviewing Spark (PySpark) through the course Taming Big Data With Apache Spark.
Setting up
Apache Spark 3.x is compatible only with Java 8, 11, or 17, and Apache Spark 4 requires Java 17. Spark is also not yet compatible with Python 3.12 or newer.
So I needed to install older versions of Java and Python alongside my system defaults.
Install Java 11
```shell
brew install openjdk@11
```
Make sure it is the default Java on the system:
```shell
cd /Library/Java/JavaVirtualMachines/jdk-23.jdk/Contents
sudo mv Info.plist Info.plist.disabled
sudo ln -sfn /opt/homebrew/opt/openjdk@11/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk.jdk
export JAVA_HOME=`/usr/libexec/java_home -v 11`
java -version
```
Downgrade Python
```shell
brew install python@3.10
```
Create a virtual environment
```shell
python3.10 -m pip install virtualenv
python3.10 -m virtualenv .venv
source .venv/bin/activate

# test
pyspark
```
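Once the virtual environment is active, a quick sanity check (my own sketch, using only the standard library) confirms the interpreter Spark will pick up is older than the 3.12 cutoff noted above:

```python
import sys

# Per the compatibility note above, this Spark setup needs Python < 3.12
compatible = sys.version_info < (3, 12)
print(f"Python {sys.version_info.major}.{sys.version_info.minor}:",
      "compatible with Spark" if compatible else "too new for Spark")
```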
To test, submit a script to Spark:

```shell
spark-submit test.py
```
This post is licensed under CC BY 4.0 by the author.