1. 作业提交方法以及参数
我们先看一下用Spark Submit提交的方法吧,下面是从官方上面摘抄的内容。
# Run application locally on 8 cores ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master local[8] /path/to/examples.jar 100 # Run on a Spark standalone cluster ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://207.184.161.138:7077 --executor-memory 20G --total-executor-cores 100 /path/to/examples.jar 1000 # Run on a YARN cluster export HADOOP_CONF_DIR=XXX ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster # can also be `yarn-client` for client mode --executor-memory 20G --num-executors 50 /path/to/examples.jar 1000 # Run a Python application on a cluster ./bin/spark-submit --master spark://207.184.161.138:7077 examples/src/main/python/pi.py 1000