
PySpark - Mac Terminal Commands


The basic Mac Terminal commands below set up Anaconda, PySpark and Java on a Mac for use with Jupyter or Colaboratory.


1. Check your Python version in Terminal


python


Python 3.9.7 (default, Sep 16 2021, 08:50:36)
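The same check can be made from inside Python, which also shows which interpreter is actually running. This is a minimal sketch: the helper name `python_version_ok` and the 3.9 minimum are assumptions chosen to match the version shown above, not a requirement stated by Anaconda or Spark.

```python
import sys


def python_version_ok(minimum=(3, 9), version=None):
    """Return True when the interpreter version is at least `minimum`.

    `version` defaults to the running interpreter; pass a tuple to check
    another version without starting that interpreter.
    """
    current = tuple(version) if version is not None else tuple(sys.version_info[:2])
    return current >= tuple(minimum)


# Show the interpreter path and version, then the result of the check.
print(sys.executable)
print(sys.version.split()[0], "ok" if python_version_ok() else "older than expected")
```

The printed path makes it easy to see whether the Anaconda interpreter or another Python on the PATH is being used.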


2. Download and install the Anaconda release for Mac that matches your Python version


2.1. from Web Browser


https://www.anaconda.com/products/individual


2.2. from Mac Terminal


bash ~/Downloads/Anaconda3-2019.03-MacOSX-x86_64.sh


3. Restart your Terminal and check the installed packages


conda list


# packages in environment at /opt/anaconda3:

#

# Name Version Build Channel

_ipyw_jlab_nb_ext_conf 0.1.0 py39hecd8cb5_0

alabaster 0.7.12 pyhd3eb1b0_0

anaconda 2021.11 py39_0
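If you want to check the listing from a script or notebook rather than by eye, the output above can be turned into a dictionary. This is a rough sketch that assumes the whitespace-separated layout shown above; `parse_conda_list` is a hypothetical helper, not part of conda.

```python
def parse_conda_list(text):
    """Map package name -> version from the text output of `conda list`."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' header lines conda prints first.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 2:
            packages[fields[0]] = fields[1]
    return packages


# Sample taken from the listing above.
sample = """\
# packages in environment at /opt/anaconda3:
#
# Name Version Build Channel
_ipyw_jlab_nb_ext_conf 0.1.0 py39hecd8cb5_0
alabaster 0.7.12 pyhd3eb1b0_0
anaconda 2021.11 py39_0
"""

print(parse_conda_list(sample)["anaconda"])  # prints 2021.11
```

To use the real listing, pass in `subprocess.run(["conda", "list"], capture_output=True, text=True).stdout` instead of `sample`.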


4. Check the /opt/anaconda3 folder and create a Spark environment


cd /opt/anaconda3

touch hello-spark.yml

vi hello-spark.yml

conda env create -n hello-spark -f hello-spark.yml

conda activate hello-spark

conda create --name myenv python=3.9

conda activate myenv


5. Check the Java version


java -version

java version "1.8.0_321"


ls /Library/Java/JavaVirtualMachines
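Java 8 reports itself as 1.8, while later releases report 11, 17 and so on, so the version string is easy to misread. The sketch below extracts the major version from the first line of the `java -version` output. `java_major_version` is a hypothetical helper, and the regular expression assumes the usual `version "..."` banner format.

```python
import re


def java_major_version(banner):
    """Return the major Java version found in a `java -version` banner, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier use the 1.x scheme, for example 1.8.0_321.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


print(java_major_version('java version "1.8.0_321"'))  # prints 8
```

Note that `java -version` writes its banner to standard error, so from Python the text is in `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr`.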


6. Download and install Java


http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html


/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

brew tap homebrew/cask-versions

brew install brew-cask

brew install brew-cask-completion

brew doctor

sudo xcode-select --install

brew tap adoptopenjdk/openjdk

brew install --cask adoptopenjdk8

brew install --cask homebrew/cask-versions/adoptopenjdk8

conda env remove -n hello-spark -y


7. Other installations in /usr/local/bin/


brew install apache-spark

apache-spark 3.2.1 is already installed and up-to-date.

brew install python3

pip3 install pyspark

brew install pyenv

brew install scala

Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 11.0.12)

brew install sbt

brew info apache-spark

/usr/local/Cellar/apache-spark/3.2.1

conda install -c conda-forge findspark

conda install -c conda-forge pyspark

python -m pip install findspark

Requirement already satisfied: findspark in /opt/anaconda3/lib/python3.9/site-packages (2.0.1)

pip install pyspark

Requirement already satisfied: pyspark in /opt/anaconda3/lib/python3.9/site-packages (3.2.1)

Requirement already satisfied: py4j==0.10.9.3 in /opt/anaconda3/lib/python3.9/site-packages (from pyspark) (0.10.9.3)

jupyter notebook --profile=pyspark
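To confirm from Python that the packages installed above are visible to the interpreter you are running, the standard library can report their versions. This is a minimal sketch: `installed_versions` is a hypothetical helper, and the default package names are simply the ones that appear in the pip output above.

```python
from importlib import metadata


def installed_versions(names=("pyspark", "findspark", "py4j")):
    """Return {package: version or None} for the current Python environment."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed for this interpreter.
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

If a package shows as not installed here but pip reports it as satisfied, pip and the notebook are probably using different Python installations.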


8. Declarations


export JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/

export JAVA_HOME=/usr/local/Cellar/openjdk@11/11.0.12/libexec/openjdk.jdk/Contents/Home

export JRE_HOME=/Library/Java/JavaVirtualMachines/openjdk-13.jdk/Contents/Home/jre/


export SPARK_HOME=/usr/local/Cellar/apache-spark/3.2.1/libexec

export PATH=/usr/local/Cellar/apache-spark/3.2.1/bin:$PATH

export PYSPARK_PYTHON=/Library/Frameworks/Python.framework/Versions/3.9/bin/python3

export PYSPARK_DRIVER_PYTHON=jupyter

export PYSPARK_DRIVER_PYTHON_OPTS='notebook'


which python3

/Library/Frameworks/Python.framework/Versions/3.9/bin/python3

which python

/opt/anaconda3/bin/python
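A wrong path in one of these declarations is a common reason Spark fails to start, so it can help to check that each variable is set and points to something that exists. The sketch below does that. `missing_paths` is a hypothetical helper, and the three variable names it checks by default are taken from the exports above.

```python
import os


def missing_paths(env=None, names=("JAVA_HOME", "SPARK_HOME", "PYSPARK_PYTHON")):
    """Return {variable: problem} for variables that are unset or point nowhere."""
    env = os.environ if env is None else env
    problems = {}
    for name in names:
        value = env.get(name)
        if not value:
            problems[name] = "not set"
        elif not os.path.exists(value):
            problems[name] = f"path does not exist: {value}"
    return problems


# An empty dictionary means every variable points to an existing path.
print(missing_paths() or "all paths found")
```

Run it in the same shell session, or the same notebook, where the exports were made; variables exported in one Terminal window are not visible in another unless they are saved in a shell profile.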


9. PySpark


9.1. Terminal confirmation


spark-shell


9.2. Spark Shell Application User interface - Jobs, Stages, Storage, Environment, Executors


http://xxxxxxxxxxxxx-imac.home:4040
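`spark-shell` confirms the Scala side. A similar check from Python could look like the sketch below, which starts a local session, counts a five-row range and returns the address of the user interface. The function names and the application name `hello-spark` are assumptions for illustration, and the code assumes PySpark and Java are installed as in the steps above.

```python
def spark_ui_url(host="localhost", port=4040):
    """Build the default address of the Spark user interface."""
    return f"http://{host}:{port}"


def run_smoke_test():
    """Start a local Spark session, run a tiny job and return (count, ui_url)."""
    # Imported here so this file still loads on a machine without PySpark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")       # run on all local cores, no cluster needed
        .appName("hello-spark")   # hypothetical name shown in the UI
        .getOrCreate()
    )
    try:
        count = spark.range(5).count()       # a five-row DataFrame
        ui = spark.sparkContext.uiWebUrl     # actual UI address for this session
    finally:
        spark.stop()
    return count, ui
```

Calling `run_smoke_test()` should return a count of 5 if the installation works. Port 4040 is only the default: when it is already taken, Spark moves to the next free port, which is why the sketch reads the address from `uiWebUrl` instead of assuming it.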



https://www.guru99.com/pyspark-tutorial.html

https://medium.com/swlh/pyspark-on-macos-installation-and-use-31f84ca61400

https://stackoverflow.com/questions/63216201/how-to-install-python3-9-with-conda

https://docs.anaconda.com/anaconda-scale/howto/spark-configuration/#scale-spark-config-sparkcontext

https://docs.datastax.com/en/jdk-install/doc/jdk-install/installOpenJdkDeb.html

https://docs.anaconda.com/anaconda-scale/howto/spark-configuration/

https://www.dataquest.io/blog/pyspark-installation-guide/

https://notadatascientist.com/install-spark-on-macos/

