Analysis of Hadoop's Startup Shell Scripts
Startup order
Start the Hadoop daemons with $HADOOP_HOME/bin/start-all.sh:
Warning: $HADOOP_HOME is deprecated.
starting namenode, logging to /opt/modules/hadoop-1.2.1/libexec/../logs/hadoop-thread-namenode-thread.out
thread.com: Warning: $HADOOP_HOME is deprecated.
thread.com:
thread.com: starting datanode, logging to /opt/modules/hadoop-1.2.1/libexec/../logs/hadoop-thread-datanode-thread.out
thread.com: Warning: $HADOOP_HOME is deprecated.
thread.com:
thread.com: starting secondarynamenode, logging to /opt/modules/hadoop-1.2.1/libexec/../logs/hadoop-thread-secondarynamenode-thread.out
starting jobtracker, logging to /opt/modules/hadoop-1.2.1/libexec/../logs/hadoop-thread-jobtracker-thread.out
thread.com: Warning: $HADOOP_HOME is deprecated.
thread.com:
thread.com: starting tasktracker, logging to /opt/modules/hadoop-1.2.1/libexec/../logs/hadoop-thread-tasktracker-thread.out
From the log messages, the startup order is:
1. NameNode
2. DataNode
3. SecondaryNameNode
4. JobTracker
5. TaskTracker
Stop the Hadoop daemons with $HADOOP_HOME/bin/stop-all.sh:
stopping jobtracker
thread.com: Warning: $HADOOP_HOME is deprecated.
thread.com:
thread.com: stopping tasktracker
stopping namenode
thread.com: Warning: $HADOOP_HOME is deprecated.
thread.com:
thread.com: no datanode to stop
thread.com: Warning: $HADOOP_HOME is deprecated.
thread.com:
thread.com: stopping secondarynamenode
Shutdown order:
1. JobTracker
2. TaskTracker
3. NameNode
4. DataNode
5. SecondaryNameNode
Starting via start-all.sh
Excerpt from start-all.sh:
bin=`dirname "$0"`
bin=`cd "$bin"; pwd`
if [ -e "$bin/../libexec/hadoop-config.sh" ];
...
# start dfs daemons
"$bin"/start-dfs.sh --config $HADOOP_CONF_DIR
# start mapred daemons
"$bin"/start-mapred.sh --config $HADOOP_CONF_DIR
Analysis:
- bin=`dirname "$0"` followed by bin=`cd "$bin"; pwd` resolves the directory the script itself lives in, so the sibling scripts can be located no matter which working directory the script is invoked from.
- Startup order: HDFS -> MapReduce
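The self-locating idiom above can be exercised on its own, without Hadoop. A minimal standalone sketch:

```shell
#!/bin/sh
# Sketch of the self-locating idiom used by start-all.sh: take the
# directory portion of the invocation path, then normalize it to an
# absolute path so sibling scripts can be found from any working directory.
bin=`dirname "$0"`
bin=`cd "$bin"; pwd`
echo "scripts live in: $bin"
```

Because `pwd` always prints an absolute path, $bin is absolute even when the script is invoked via a relative path such as ./start-all.sh.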
Starting HDFS and MapReduce separately
Under $HADOOP_HOME/bin, the two sub-scripts are invoked in this order:
- start-dfs.sh
- start-mapred.sh
Excerpt from start-dfs.sh:
# Start hadoop dfs daemons.
# Optinally upgrade or rollback dfs state.
# Run this on master node.
usage="Usage: start-dfs.sh [-upgrade|-rollback]"
...
# start dfs daemons
# start namenode after datanodes, to minimize time namenode is up w/o data
# note: datanodes will log connection errors until namenode starts
"$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode $nameStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode $dataStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start secondarynamenode
Analysis:
- "Optinally upgrade or rollback dfs state" (the typo is in the original source): the -upgrade and -rollback flags select an upgrade or rollback of the DFS state, per usage="Usage: start-dfs.sh [-upgrade|-rollback]".
- "Run this on master node": the script is meant to be run on the master node.
- Despite the wording of the comment, the code starts the NameNode first; until it is up, the DataNodes log connection errors, which is expected noise at startup (this is what the "note" line warns about).
- Startup order: NameNode -> DataNode -> SecondaryNameNode
- The daemons are launched through hadoop-daemon.sh and hadoop-daemons.sh.
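The difference between the two launchers is the host list: hadoop-daemon.sh starts one daemon on the local machine, while hadoop-daemons.sh runs hadoop-daemon.sh on every host listed in a file (conf/slaves by default, conf/masters with --hosts masters), going through slaves.sh and ssh. A hedged sketch of that loop; start_on_hosts is a hypothetical name, and the echo stands in for the real ssh invocation:

```shell
#!/bin/sh
# Simplified model of hadoop-daemons.sh: read a host list, skip comments
# and blank lines (as slaves.sh does), and launch the daemon on each host.
# start_on_hosts is illustrative; the real script execs ssh instead of echo.
start_on_hosts() {
    hostlist=$1
    daemon=$2
    for host in `sed 's/#.*$//;/^$/d' "$hostlist"`; do
        echo "ssh $host hadoop-daemon.sh start $daemon"
    done
}

# Usage with a throwaway host list:
printf 'node1\nnode2\n' > /tmp/slaves.example
start_on_hosts /tmp/slaves.example datanode
# prints one "ssh ... start datanode" line per host
```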
Excerpt from start-mapred.sh:
# Start hadoop map reduce daemons. Run this on master node.
...
# start mapred daemons
# start jobtracker first to minimize connection errors at startup
"$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start jobtracker
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start tasktracker
Analysis:
- The JobTracker is started first so that the TaskTrackers do not pile up connection errors while waiting for it.
- Startup order: JobTracker -> TaskTracker
Launching via hadoop-daemon.sh
Putting the above together, the overall startup order is NameNode -> DataNode -> SecondaryNameNode -> JobTracker -> TaskTracker.
Both start-dfs.sh and start-mapred.sh ultimately delegate to hadoop-daemon.sh and hadoop-daemons.sh.
Excerpt from hadoop-daemon.sh:
usage="Usage: hadoop-daemon.sh [--config <conf-dir>] [--hosts hostlistfile] (start|stop) <hadoop-command> <args...>"
...
export HADOOP_LOGFILE=hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.log
Analysis:
- hadoop-daemon.sh [--config <conf-dir>] [--hosts hostlistfile] (start|stop) <hadoop-command> <args...>: a single script handles both starting and stopping every daemon type.
- The log file name hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.log embeds the user, the daemon type, and the host, so each daemon on each machine gets its own log file.
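The log-file naming can be reproduced in isolation. A sketch assuming the same defaults as the real script (HADOOP_IDENT_STRING falls back to the invoking user; "namenode" is just an example value for the <hadoop-command> argument):

```shell
#!/bin/sh
# Sketch of how hadoop-daemon.sh builds its per-daemon log file name.
# HADOOP_IDENT_STRING defaults to the invoking user in the real script;
# $command would be the <hadoop-command> argument.
HADOOP_IDENT_STRING=${HADOOP_IDENT_STRING:-$USER}
command=namenode
HOSTNAME=`hostname`
export HADOOP_LOGFILE=hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.log
echo "$HADOOP_LOGFILE"
```

For user "thread" on host "thread.com" and command "namenode", this yields a name matching the hadoop-thread-namenode-thread.* files seen in the startup log above.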