Using Ansible to Start and Stop Hadoop Services

I. Environment:

One server running CentOS 7.5, used as the Ansible control node;

Three managed hosts: hadoop-master (192.168.1.18), hadoop-slave1 (192.168.1.19), hadoop-slave2 (192.168.1.20)

II. Installing Ansible:

[root@centos75 ~]# yum install ansible

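III. Configuring Ansible:

1. Passwordless SSH from the control node to the managed hosts:

(1) Generate a key pair: ssh-keygen -t rsa

(2) Copy the public key to each host: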

[root@centos75 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.18

[root@centos75 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.19

[root@centos75 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.20

2. Ansible configuration

(1) Inventory (host list) configuration:

[root@centos75 ~]# vi /etc/ansible/hosts

...

[hadoop]

192.168.1.[18:20] #This is an IP address range notation; each IP address can also be listed on its own line.
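The same group can also be written as a YAML inventory, which Ansible accepts alongside the INI format. The following is a minimal sketch, assuming it is saved under a hypothetical file name such as hosts.yml and selected with the -i option; the group name and address range are the ones used above.

```yaml
# Hypothetical file: hosts.yml (the name is an assumption, not from the article)
all:
  children:
    hadoop:
      hosts:
        # Same range notation as in the INI inventory: expands to .18, .19 and .20
        "192.168.1.[18:20]":
```

It could then be used as, for example, ansible -i hosts.yml hadoop -m ping.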

(2)配置ansible.cfg：

[root@centos75 ~]# vi /etc/ansible/ansible.cfg

...

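#Disable the SSH host key check on each ansible command run (the comment sits on its own line because ansible.cfg only accepts ";" for inline comments)

host_key_checking = False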

...

#Enable logging

log_path = /var/log/ansible.log

...

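#Ansible connection acceleration settings (accelerated mode is deprecated and has been removed from newer Ansible releases, which ignore this section)

[accelerate]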

#accelerate_port = 5099

accelerate_port = 10000

...

accelerate_multi_key = yes

...

#Suppress deprecation warnings to reduce unnecessary output

deprecation_warnings = False

...

IV. Testing

[root@centos75 ~]# ansible all -m ping

192.168.1.20 | SUCCESS => {

"changed": false,

"ping": "pong"

}

192.168.1.18 | SUCCESS => {

"changed": false,

"ping": "pong"

}

192.168.1.19 | SUCCESS => {

"changed": false,

"ping": "pong"

}

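The output above shows that every managed host answers the ping module, so the Ansible configuration works.

V. Usage examples

(1) Ad-hoc mode:

Update the /etc/hosts file on the three Hadoop cluster servers: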

[root@centos75 ~]# vi hosts

#127.0.1.1 hadoop-master

192.168.1.18 hadoop-master

192.168.1.19 hadoop-slave1

192.168.1.20 hadoop-slave2

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback

fe00::0 ip6-localnet

ff00::0 ip6-mcastprefix

ff02::1 ip6-allnodes

ff02::2 ip6-allrouters

~

[root@centos75 ~]# ansible hadoop -m copy -a "src=/root/hosts dest=/etc/hosts"

192.168.1.20 | SUCCESS => {

"changed": true,

"checksum": "214f72ce3329805c07748997e11313fffb03f667",

"dest": "/etc/hosts",

"gid": 0,

"group": "root",

"md5sum": "127193e1ec4773ce0195636c5ac2bf3a",

"mode": "0644",

"owner": "root",

"size": 298,

"src": "/root/.ansible/tmp/ansible-tmp-1536384515.76-109467000571031/source",

"state": "file",

"uid": 0

}

192.168.1.18 | SUCCESS => {

"changed": true,

"checksum": "214f72ce3329805c07748997e11313fffb03f667",

"dest": "/etc/hosts",

"gid": 0,

"group": "root",

"md5sum": "127193e1ec4773ce0195636c5ac2bf3a",

"mode": "0644",

"owner": "root",

"size": 298,

"src": "/root/.ansible/tmp/ansible-tmp-1536384515.74-269105082907411/source",

"state": "file",

"uid": 0

}

192.168.1.19 | SUCCESS => {

"changed": true,

"checksum": "214f72ce3329805c07748997e11313fffb03f667",

"dest": "/etc/hosts",

"gid": 0,

"group": "root",

"md5sum": "127193e1ec4773ce0195636c5ac2bf3a",

"mode": "0644",

"owner": "root",

"size": 298,

"src": "/root/.ansible/tmp/ansible-tmp-1536384515.75-259083114686776/source",

"state": "file",

"uid": 0

}

The hosts file on each managed host can then be checked with:

ansible hadoop -m shell -a 'cat /etc/hosts'

ansible hadoop -m shell -a 'ls -lhat /etc/hosts'
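The ad-hoc copy above can also be kept as a small playbook so that it is repeatable. This is a sketch under assumptions: the file name (for example hosts-sync.yml) is hypothetical, and the backup option is an addition that the original command does not use. The owner, group and mode values match what the output above reports.

```yaml
---
# Hypothetical file: hosts-sync.yml
- hosts: hadoop
  # No facts are needed for a plain file copy
  gather_facts: false
  tasks:
    - name: distribute the shared hosts file
      copy:
        src: /root/hosts        # file prepared on the control node
        dest: /etc/hosts
        owner: root
        group: root
        mode: "0644"
        backup: true            # assumption: keep a timestamped copy of the old file
```

Running it a second time should report ok rather than changed, since the copy module compares checksums before writing.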

(2) Playbook mode:

Start the Hadoop cluster services:

[root@centos75 ~]# vi hadoop-start.yml

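---
# "---" marks the start of a YAML document. It is optional when the file holds a single document, which is why omitting it also works. A playbook file must contain only one YAML document, so repeating the marker causes an error.
- hosts: hadoop
  # Note: "-" must be followed by a space, and so must ":".
  tasks:
    # Note: indent with two spaces per level; TAB characters are not allowed.
    - name: startup hadoop datanode services
      # Although hadoop-2.7.3/sbin is already on PATH on the cluster servers, the absolute path is required here, most likely because Ansible runs commands through a non-login, non-interactive shell that does not read the profile files where PATH is set.
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh start datanode

- hosts: 192.168.1.18
  tasks:
    - name: startup hadoop namenode services
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh start namenode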

~

[root@centos75 ~]# ansible-playbook hadoop-start.yml

PLAY [hadoop] ******************************************************************

TASK [Gathering Facts] *********************************************************

ok: [192.168.1.20]

ok: [192.168.1.19]

ok: [192.168.1.18]

TASK [startup hadoop datanode services] ****************************************

changed: [192.168.1.19]

changed: [192.168.1.18]

changed: [192.168.1.20]

PLAY [192.168.1.18] ************************************************************

TASK [Gathering Facts] *********************************************************

ok: [192.168.1.18]

TASK [startup hadoop namenode services] ****************************************

changed: [192.168.1.18]

PLAY RECAP *********************************************************************

192.168.1.18 : ok=4 changed=2 unreachable=0 failed=0

192.168.1.19 : ok=2 changed=1 unreachable=0 failed=0

192.168.1.20 : ok=2 changed=1 unreachable=0 failed=0

The running services can be checked on the cluster servers:

root@hadoop-master:~# jps

8976 DataNode

9231 Jps

9093 NameNode

root@hadoop-slave1:~# jps

7058 Jps

6972 DataNode
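Regarding the absolute-path requirement noted in the start playbook: one possible alternative is to extend PATH for the play with the environment keyword, so that the script can be called by name. This is only a sketch, not the author's method; it assumes the same sbin directory and relies on fact gathering, because ansible_env is populated from the gathered facts.

```yaml
---
- hosts: hadoop
  # Fact gathering stays enabled (the default) because ansible_env comes from the facts
  environment:
    # Prepend the Hadoop sbin directory to the remote PATH for every task in this play
    PATH: "/root/hadoop-2.7.3/sbin:{{ ansible_env.PATH }}"
  tasks:
    - name: startup hadoop datanode services
      shell: hadoop-daemon.sh start datanode
```

The absolute path used in the article remains the simpler choice when only one or two commands are involved.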

Stop the Hadoop cluster services:

[root@centos75 ~]# vi hadoop-stop.yml

---
- hosts: hadoop
  tasks:
    - name: stop hadoop datanode services
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh stop datanode

- hosts: 192.168.1.18
  tasks:
    - name: stop hadoop namenode services
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh stop namenode

~

[root@centos75 ~]# ansible-playbook hadoop-stop.yml

PLAY [hadoop] ******************************************************************

TASK [Gathering Facts] *********************************************************

ok: [192.168.1.20]

ok: [192.168.1.19]

ok: [192.168.1.18]

TASK [stop hadoop datanode services] *******************************************

changed: [192.168.1.20]

changed: [192.168.1.19]

changed: [192.168.1.18]

PLAY [192.168.1.18] ************************************************************

TASK [Gathering Facts] *********************************************************

ok: [192.168.1.18]

TASK [stop hadoop namenode services] *******************************************

changed: [192.168.1.18]

PLAY RECAP *********************************************************************

192.168.1.18 : ok=4 changed=2 unreachable=0 failed=0

192.168.1.19 : ok=2 changed=1 unreachable=0 failed=0

192.168.1.20 : ok=2 changed=1 unreachable=0 failed=0

As the runs above show, Ansible now provides centralized control over starting and stopping the cluster services.
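Since the start and stop playbooks differ only in one word, they could be merged into a single parameterized playbook. The sketch below is an assumption-laden variant, not the article's own setup: the file name (for example hadoop-ctl.yml) and the variable names hadoop_sbin, hadoop_action and namenode_host are hypothetical.

```yaml
---
# Hypothetical file: hadoop-ctl.yml
- hosts: hadoop
  gather_facts: false
  vars:
    hadoop_sbin: /root/hadoop-2.7.3/sbin
    hadoop_action: start            # default; override on the command line
    namenode_host: "192.168.1.18"   # the hadoop-master address from the article
  tasks:
    - name: "{{ hadoop_action }} hadoop datanode services"
      shell: "{{ hadoop_sbin }}/hadoop-daemon.sh {{ hadoop_action }} datanode"

    - name: "{{ hadoop_action }} hadoop namenode services"
      shell: "{{ hadoop_sbin }}/hadoop-daemon.sh {{ hadoop_action }} namenode"
      # Run the namenode command only on the master
      when: inventory_hostname == namenode_host
```

With this layout, ansible-playbook hadoop-ctl.yml would start the services and ansible-playbook hadoop-ctl.yml -e hadoop_action=stop would stop them, because extra variables passed with -e take precedence over play variables.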
Original article: https://www.cnblogs.com/sfccl/p/11247129.html