On every NodeManager, edit yarn-site.xml: add spark_shuffle to the value of yarn.nodemanager.aux-services, and set yarn.nodemanager.aux-services.spark_shuffle.class to org.apache.spark.network.yarn.YarnShuffleService, as follows:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
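A quick sanity check can catch a malformed entry before the NodeManagers are restarted. The following is a minimal sketch, not an official tool: has_spark_shuffle is a hypothetical helper name, and it uses a plain text match rather than a real XML parser, so it assumes the <name> and <value> elements appear in the usual order with nothing between them.

```shell
# Hypothetical helper: succeeds if the given yarn-site.xml lists spark_shuffle
# inside a well-formed yarn.nodemanager.aux-services value.
has_spark_shuffle() {
  # Collapse all whitespace so the check works whether the XML
  # is written on a single line or spread across several lines.
  tr -d ' \t\r\n' < "$1" |
    grep -Eq '<name>yarn\.nodemanager\.aux-services</name><value>[^<]*spark_shuffle[^<]*</value>'
}
```

Call it with the path to the file on each node, for example has_spark_shuffle "$HADOOP_CONF_DIR/yarn-site.xml" && echo ok (the location of yarn-site.xml is an assumption and depends on the installation). Because the pattern requires a closing </value> tag, the check also fails when that tag is mistyped.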
Edit $SPARK_HOME/conf/spark-defaults.conf and add the following two entries:
# Minimum number of executors
spark.dynamicAllocation.minExecutors 1
# Maximum number of executors
spark.dynamicAllocation.maxExecutors 100
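To confirm what Spark will actually read for these keys, the value can be printed straight from the file. This is a rough sketch: conf_value is a hypothetical helper, and it assumes the whitespace-separated "key value" layout used above (it does not handle the key=value form). Everything after the key is printed, so any trailing text on the same line shows up as part of the value, which is a reason to keep comments on their own lines.

```shell
# Hypothetical helper: print the value configured for a key.
# Usage: conf_value <path-to-spark-defaults.conf> <key>
conf_value() {
  # Match lines whose first field equals the key, strip the key and the
  # whitespace after it, and print the rest of the line as the value.
  awk -v key="$2" '$1 == key { sub(/^[ \t]*[^ \t]+[ \t]+/, ""); print; exit }' "$1"
}
```

For example, conf_value "$SPARK_HOME/conf/spark-defaults.conf" spark.dynamicAllocation.maxExecutors should print only the number.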
At submit time, turn on dynamic executor allocation. The example below uses spark-sql in YARN client mode:
spark-submit \
  --class SySpark.SqlOnSpark \
  --master yarn-client \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.enabled=true \
  /data/jars/SqlOnSpark.jar \
  "SELECT COUNT(*) FROM xx"