An Analysis of Hadoop's Startup Scripts
Author: 尹正杰 (Yin Zhengjie)
Copyright notice: this is original work. Reposting is not permitted; violations will be pursued under the law.
If you are reading this post, you presumably already have a systematic understanding of Hadoop — at the very least you should know the various ways to deploy it. If not, that's fine: my earlier notes walk through each deployment mode. Throughout this post I use a small helper script, xcall.sh, to run the same command on every node in the cluster:
[yinzhengjie@s101 ~]$ cat `which xcall.sh`
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

#Check that the user passed at least one argument
if [ $# -lt 1 ];then
    echo "Please pass an argument"
    exit
fi

#Capture the command the user typed
cmd=$@

for (( i=101;i<=104;i++ ))
do
    #Turn the terminal text green
    tput setaf 2
    echo ============= s$i $cmd ============
    #Restore the default (grey-white) terminal color
    tput setaf 7
    #Run the command on the remote host
    ssh s$i $cmd
    #Report whether the command succeeded
    if [ $? == 0 ];then
        echo "Command executed successfully"
    fi
done
[yinzhengjie@s101 ~]$
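A quick usage example — run jps across all four nodes (you will see this pattern throughout the post to verify each startup step):

xcall.sh jps

Note that the script assumes passwordless SSH from s101 to s101–s104 is already configured; otherwise each iteration of the loop will prompt for a password.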
I. Analysis of the start-all.sh script
[yinzhengjie@s101 ~]$ cat `which start-all.sh` | grep -v ^# | grep -v ^$
echo "This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh"
bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh
if [ -f "${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh ]; then
  "${HADOOP_HDFS_HOME}"/sbin/start-dfs.sh --config $HADOOP_CONF_DIR
fi
if [ -f "${HADOOP_YARN_HOME}"/sbin/start-yarn.sh ]; then
  "${HADOOP_YARN_HOME}"/sbin/start-yarn.sh --config $HADOOP_CONF_DIR
fi
[yinzhengjie@s101 ~]$
The very first line tells us this script is deprecated: it prints "This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh" and then simply delegates to its two replacements, start-dfs.sh and start-yarn.sh.
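In other words, running start-all.sh is effectively the same as running the two recommended scripts yourself — a minimal sketch, assuming Hadoop's sbin directory is on your PATH (as it is on my nodes):

start-dfs.sh     #start the HDFS daemons
start-yarn.sh    #start the YARN daemons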
II. Analysis of the start-dfs.sh script
[yinzhengjie@s101 ~]$ more `which start-dfs.sh` | grep -v ^# | grep -v ^$
usage="Usage: start-dfs.sh [-upgrade|-rollback] [other options such as -clusterId]"
bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hdfs-config.sh
if [[ $# -ge 1 ]]; then
  startOpt="$1"
  shift
  case "$startOpt" in
    -upgrade)
      nameStartOpt="$startOpt"
    ;;
    -rollback)
      dataStartOpt="$startOpt"
    ;;
    *)
      echo $usage
      exit 1
    ;;
  esac
fi
nameStartOpt="$nameStartOpt $@"
NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -namenodes)
echo "Starting namenodes on [$NAMENODES]"
"$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
  --config "$HADOOP_CONF_DIR" \
  --hostnames "$NAMENODES" \
  --script "$bin/hdfs" start namenode $nameStartOpt
if [ -n "$HADOOP_SECURE_DN_USER" ]; then
  echo \
    "Attempting to start secure cluster, skipping datanodes. " \
    "Run start-secure-dns.sh as root to complete startup."
else
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --script "$bin/hdfs" start datanode $dataStartOpt
fi
SECONDARY_NAMENODES=$($HADOOP_PREFIX/bin/hdfs getconf -secondarynamenodes 2>/dev/null)
if [ -n "$SECONDARY_NAMENODES" ]; then
  echo "Starting secondary namenodes [$SECONDARY_NAMENODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
      --config "$HADOOP_CONF_DIR" \
      --hostnames "$SECONDARY_NAMENODES" \
      --script "$bin/hdfs" start secondarynamenode
fi
SHARED_EDITS_DIR=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.namenode.shared.edits.dir 2>&-)
case "$SHARED_EDITS_DIR" in
qjournal://*)
  JOURNAL_NODES=$(echo "$SHARED_EDITS_DIR" | sed 's,qjournal://\([^/]*\)/.*,\1,g; s/;/ /g; s/:[0-9]*//g')
  echo "Starting journal nodes [$JOURNAL_NODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
      --config "$HADOOP_CONF_DIR" \
      --hostnames "$JOURNAL_NODES" \
      --script "$bin/hdfs" start journalnode ;;
esac
AUTOHA_ENABLED=$($HADOOP_PREFIX/bin/hdfs getconf -confKey dfs.ha.automatic-failover.enabled)
if [ "$(echo "$AUTOHA_ENABLED" | tr A-Z a-z)" = "true" ]; then
  echo "Starting ZK Failover Controllers on NN hosts [$NAMENODES]"
  "$HADOOP_PREFIX/sbin/hadoop-daemons.sh" \
    --config "$HADOOP_CONF_DIR" \
    --hostnames "$NAMENODES" \
    --script "$bin/hdfs" start zkfc
fi
[yinzhengjie@s101 ~]$
(I have already filtered the comments out above.) Broadly, this script starts the HDFS daemons: the NameNode(s), the DataNodes, and the SecondaryNameNode — and, when high availability is configured, the JournalNodes and ZK Failover Controllers as well.
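One detail worth noticing: the script hard-codes no hostnames. It asks hdfs getconf where each role should run, and you can issue the same queries by hand to predict what start-dfs.sh will do (the exact output depends on your configuration):

hdfs getconf -namenodes                                    #hosts that run a NameNode
hdfs getconf -secondarynamenodes                           #hosts that run a SecondaryNameNode
hdfs getconf -confKey dfs.namenode.shared.edits.dir        #a qjournal:// URI implies JournalNodes
hdfs getconf -confKey dfs.ha.automatic-failover.enabled    #"true" implies ZKFC daemons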
1>. Starting the NameNode by itself:
[yinzhengjie@s101 ~]$ hadoop-daemon.sh --hostnames s101 start namenode
starting namenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-namenode-s101.out
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
11531 Jps
11453 NameNode
Command executed successfully
============= s102 jps ============
3657 Jps
Command executed successfully
============= s103 jps ============
3627 Jps
Command executed successfully
============= s104 jps ============
3598 Jps
Command executed successfully
[yinzhengjie@s101 ~]$
That is how you start a single NameNode. For a batch start, use hadoop-daemons.sh instead; since my cluster has only one NameNode, the two commands happen to produce the same visible result.
[yinzhengjie@s101 ~]$ hadoop-daemons.sh --hostnames `hdfs getconf -namenodes` start namenode
s101: starting namenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-namenode-s101.out
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
13395 Jps
13318 NameNode
Command executed successfully
============= s102 jps ============
3960 Jps
Command executed successfully
============= s103 jps ============
3930 Jps
Command executed successfully
============= s104 jps ============
3899 Jps
Command executed successfully
[yinzhengjie@s101 ~]$
2>. Starting a DataNode by itself:
[yinzhengjie@s101 ~]$ hadoop-daemon.sh start datanode
starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s101.out
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
12119 Jps
12045 DataNode
Command executed successfully
============= s102 jps ============
3779 Jps
Command executed successfully
============= s103 jps ============
3750 Jps
Command executed successfully
============= s104 jps ============
3719 Jps
Command executed successfully
[yinzhengjie@s101 ~]$
That is how you start a single DataNode. To start DataNodes on all slave nodes at once, use hadoop-daemons.sh; with three DataNode hosts the effect is easy to see.
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
14482 Jps
Command executed successfully
============= s102 jps ============
4267 Jps
Command executed successfully
============= s103 jps ============
4238 Jps
Command executed successfully
============= s104 jps ============
4206 Jps
Command executed successfully
[yinzhengjie@s101 ~]$ hadoop-daemons.sh start datanode
s102: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s102.out
s104: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s104.out
s103: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s103.out
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
14552 Jps
Command executed successfully
============= s102 jps ============
4386 Jps
4316 DataNode
Command executed successfully
============= s103 jps ============
4357 Jps
4287 DataNode
Command executed successfully
============= s104 jps ============
4325 Jps
4255 DataNode
Command executed successfully
[yinzhengjie@s101 ~]$
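Where did hadoop-daemons.sh get its list of hosts? From the slaves file, $HADOOP_CONF_DIR/slaves (on my setup, /soft/hadoop/etc/hadoop/slaves). Judging by the output above, on this cluster it presumably contains just the three worker hostnames:

s102
s103
s104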
3>. Starting the SecondaryNameNode by itself
[yinzhengjie@s101 ~]$ hadoop-daemon.sh --hostnames s101 start secondarynamenode
starting secondarynamenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-secondarynamenode-s101.out
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
15127 SecondaryNameNode
15179 Jps
Command executed successfully
============= s102 jps ============
4541 Jps
Command executed successfully
============= s103 jps ============
4513 Jps
Command executed successfully
============= s104 jps ============
4480 Jps
Command executed successfully
[yinzhengjie@s101 ~]$
That is how you start a single SecondaryNameNode. For a batch start, again reach for hadoop-daemons.sh; with three nodes the effect is obvious.
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
17273 Jps
Command executed successfully
============= s102 jps ============
4993 Jps
Command executed successfully
============= s103 jps ============
4965 Jps
Command executed successfully
============= s104 jps ============
4929 Jps
Command executed successfully
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ for i in `cat /soft/hadoop/etc/hadoop/slaves | grep -v ^#` ;do hadoop-daemons.sh --hostnames $i start secondarynamenode ;done
s102: starting secondarynamenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-secondarynamenode-s102.out
s103: starting secondarynamenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-secondarynamenode-s103.out
s104: starting secondarynamenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-secondarynamenode-s104.out
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
17394 Jps
Command executed successfully
============= s102 jps ============
5089 Jps
5042 SecondaryNameNode
Command executed successfully
============= s103 jps ============
5061 Jps
5014 SecondaryNameNode
Command executed successfully
============= s104 jps ============
5026 Jps
4979 SecondaryNameNode
Command executed successfully
[yinzhengjie@s101 ~]$
III. Analysis of the start-yarn.sh script
[yinzhengjie@s101 ~]$ cat /soft/hadoop/sbin/start-yarn.sh | grep -v ^# | grep -v ^$
echo "starting yarn daemons"
bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/yarn-config.sh
"$bin"/yarn-daemon.sh --config $YARN_CONF_DIR start resourcemanager
"$bin"/yarn-daemons.sh --config $YARN_CONF_DIR start nodemanager
[yinzhengjie@s101 ~]$
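Note the asymmetry in the last two lines: the ResourceManager is started with yarn-daemon.sh, i.e. on the local host only, while the NodeManagers are started with yarn-daemons.sh, i.e. on every host in the slaves file. So starting the ResourceManager by itself is simply:

yarn-daemon.sh start resourcemanager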
Otherwise its usage mirrors the HDFS scripts above. Starting a single daemon:
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
18290 Jps
Command executed successfully
============= s102 jps ============
5314 Jps
Command executed successfully
============= s103 jps ============
5288 Jps
Command executed successfully
============= s104 jps ============
5249 Jps
Command executed successfully
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ yarn-daemon.sh start nodemanager
starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-nodemanager-s101.out
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
18344 NodeManager
18474 Jps
Command executed successfully
============= s102 jps ============
5337 Jps
Command executed successfully
============= s103 jps ============
5311 Jps
Command executed successfully
============= s104 jps ============
5273 Jps
Command executed successfully
[yinzhengjie@s101 ~]$
And a batch start in practice:
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
18570 Jps
Command executed successfully
============= s102 jps ============
5383 Jps
Command executed successfully
============= s103 jps ============
5357 Jps
Command executed successfully
============= s104 jps ============
5319 Jps
Command executed successfully
[yinzhengjie@s101 ~]$ yarn-daemons.sh start nodemanager
s102: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-nodemanager-s102.out
s104: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-nodemanager-s104.out
s103: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-nodemanager-s103.out
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
18645 Jps
Command executed successfully
============= s102 jps ============
5562 Jps
5436 NodeManager
Command executed successfully
============= s103 jps ============
5536 Jps
5410 NodeManager
Command executed successfully
============= s104 jps ============
5498 Jps
5372 NodeManager
Command executed successfully
[yinzhengjie@s101 ~]$
IV. Analysis of the stop-all.sh script
[yinzhengjie@s101 ~]$ cat `which stop-all.sh` | grep -v ^# | grep -v ^$
echo "This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh"
bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh
if [ -f "${HADOOP_HDFS_HOME}"/sbin/stop-dfs.sh ]; then
  "${HADOOP_HDFS_HOME}"/sbin/stop-dfs.sh --config $HADOOP_CONF_DIR
fi
if [ -f "${HADOOP_HDFS_HOME}"/sbin/stop-yarn.sh ]; then
  "${HADOOP_HDFS_HOME}"/sbin/stop-yarn.sh --config $HADOOP_CONF_DIR
fi
[yinzhengjie@s101 ~]$
The first line — echo "This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh" — should make it obvious what is going on: this is simply the mirror image of start-all.sh, with every start swapped for stop, and its replacements are stop-dfs.sh and stop-yarn.sh.
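Accordingly, every start command shown earlier has a stop counterpart. A quick sketch of the shutdown side, mirroring the examples above:

stop-dfs.sh                       #stops NameNode, DataNodes, SecondaryNameNode (and JournalNodes/ZKFC if configured)
stop-yarn.sh                      #stops the ResourceManager and NodeManagers
hadoop-daemon.sh stop datanode    #stop one daemon on the local host
yarn-daemon.sh stop nodemanager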
V. Summary
Putting it all together, we arrive at the following four identities:
1>. start-all.sh = start-dfs.sh + start-yarn.sh
2>. stop-all.sh = stop-dfs.sh + stop-yarn.sh
3>. hadoop-daemons.sh = hadoop-daemon.sh + slaves
4>. yarn-daemons.sh = yarn-daemon.sh + slaves
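The last two identities can be taken almost literally: each plural script just fans the singular script out over every host listed in the slaves file (by way of slaves.sh). A rough paraphrase of the behaviour — a sketch, not the actual implementation:

#Simplified sketch of hadoop-daemons.sh; the real script delegates to slaves.sh
for host in $(grep -v '^#' "$HADOOP_CONF_DIR/slaves"); do
    ssh "$host" hadoop-daemon.sh "$@" &
done
wait    #like slaves.sh, run the hosts in parallel and wait for them all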