Exception 1: When submitting a job to a Spark on YARN cluster, the following warning often appears:
14/08/09 11:45:32 WARN component.AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use
java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:444)
    at sun.nio.ch.Net.bind(Net.java:436)
Reason: the port is already in use. In fact, Spark simply retries the next port, so this exception is harmless (it is only logged at WARN level).
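Since Spark retries ports on its own (4040, 4041, ... up to spark.port.maxRetries, which defaults to 16), the warning can be ignored. If you want to silence it, you can pin the UI port explicitly at submit time. A minimal sketch, assuming a generic spark-submit invocation; the main class and jar name below are hypothetical placeholders, not from this post:

```shell
# Pin the driver UI port and raise the retry count so the bind WARN
# does not appear. spark.ui.port and spark.port.maxRetries are standard
# Spark configuration properties.
# com.example.MyApp and myapp.jar are hypothetical placeholders.
spark-submit \
  --master yarn \
  --conf spark.ui.port=4050 \
  --conf spark.port.maxRetries=32 \
  --class com.example.MyApp \
  myapp.jar
```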
Solution: http://blog.csdn.net/sunflower_cao/article/details/37655873
Exception 2:
WARN YarnClientClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
Reason 1:
The node that submits the job cannot communicate with the Spark worker nodes. After submission, a process starts on the submitting node to display job progress (commonly on port 4044), and the workers report progress back to that port. If the hostname or IP is misconfigured in /etc/hosts, the error above is raised.
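A quick way to catch this misconfiguration is to check what the submitting host's name actually resolves to. A minimal sketch, assuming a Linux box with `getent` available; if the name resolves to loopback or nothing, workers will not be able to call back:

```shell
# Verify that this host's name resolves to a routable address, not
# loopback, so worker nodes can reach the process on the submitting node.
HOST=$(hostname)
RESOLVED=$(getent hosts "$HOST" | awk '{print $1}' | head -n 1)
echo "$HOST resolves to: ${RESOLVED:-<nothing>}"
case "$RESOLVED" in
  127.*|"")
    # Loopback (or no entry) is only reachable from this machine itself.
    echo "WARNING: add a proper '<ip> $HOST' line to /etc/hosts on every node"
    ;;
  *)
    echo "OK: workers should be able to reach $HOST"
    ;;
esac
```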
Reason 2:
Memory was definitely sufficient, yet resources still could not be acquired. Checking the firewall revealed that the client had only opened port 80; everything else was blocked.
Solution:
Stop the firewall on every node (service iptables stop), then run the aforementioned runSpark.sh script on the Spark on YARN cluster.
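Disabling the firewall entirely is the quick fix from the post. A less drastic alternative is to open only the port range Spark needs; a sketch, assuming iptables on CentOS 6-era nodes (matching the `service iptables stop` command above) and requiring root:

```shell
# Quick fix from the post: disable the firewall on this node entirely.
service iptables stop

# Safer alternative: keep the firewall up but allow the Spark driver/UI
# port range, then persist the rule. The 4040:4050 range is an example;
# adjust it to the ports your jobs actually bind.
# iptables -A INPUT -p tcp --dport 4040:4050 -j ACCEPT
# service iptables save
```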