I. Environment:
One server running CentOS 7.5, used as the Ansible control node;
Three managed hosts: hadoop-master (192.168.1.18), hadoop-slave1 (192.168.1.19), hadoop-slave2 (192.168.1.20)
II. Installing Ansible:
[root@centos75 ~]# yum install ansible
III. Configuring Ansible:
1. Passwordless SSH from the server to the managed hosts:
(1) Generate a key pair: ssh-keygen -t rsa
(2) Copy the public key to each host:
[root@centos75 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.18
[root@centos75 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.19
[root@centos75 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.20
2. Ansible configuration
(1) Inventory (host list):
[root@centos75 ~]# vi /etc/ansible/hosts
...
[hadoop]
192.168.1.[18:20] # a range pattern covering .18 through .20 inclusive; each IP address can also be listed on its own line.
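To make the range pattern concrete, here is a minimal Python sketch of how a single numeric [start:end] range expands into individual hosts. It is an illustration only, not Ansible's own parser; the function name expand_range is hypothetical, and the sketch ignores features such as zero-padded or alphabetic ranges.

```python
import re


def expand_range(pattern):
    """Expand one numeric [start:end] range in a host pattern (illustrative only)."""
    match = re.search(r"\[(\d+):(\d+)\]", pattern)
    if match is None:
        return [pattern]  # no range present, so the pattern names a single host
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    # Both ends of the range are included
    return [prefix + str(n) + suffix for n in range(start, end + 1)]


print(expand_range("192.168.1.[18:20]"))
```

For the pattern used in this inventory, the list contains the three addresses from section I.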
(2) Edit ansible.cfg:
[root@centos75 ~]# vi /etc/ansible/ansible.cfg
...
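# Disable SSH host key checking when ansible commands run.
# (Comments are kept on their own lines here: in ansible.cfg an inline comment must start with ";", not "#".)
host_key_checking = False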
...
# Enable logging
log_path = /var/log/ansible.log
...
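# Accelerated connection mode settings. Accelerated mode was deprecated in favor of SSH with
# ControlPersist and pipelining and dropped from later Ansible releases, so depending on the
# installed version these settings may have no effect.
[accelerate]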
#accelerate_port = 5099
accelerate_port = 10000
...
accelerate_multi_key = yes
...
# Suppress deprecation warnings to reduce noise in the output
deprecation_warnings = False
...
IV. Testing
[root@centos75 ~]# ansible all -m ping
192.168.1.20 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
192.168.1.18 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
192.168.1.19 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
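The output above shows that every managed host answered the ping module, so the Ansible setup is working. (The ping module is not an ICMP ping: it checks that Ansible can log in over SSH and run Python on the target.)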
V. Usage examples
(1) Ad-hoc mode:
Update the /etc/hosts file on the three Hadoop cluster servers:
[root@centos75 ~]# vi hosts
#127.0.1.1 hadoop-master
192.168.1.18 hadoop-master
192.168.1.19 hadoop-slave1
192.168.1.20 hadoop-slave2
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
~
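The three cluster entries in the file above follow directly from the host list in section I. As a small illustration, the Python sketch below renders those lines from a name-to-address mapping, which can help keep the file and the inventory consistent as hosts are added. The names CLUSTER and render_hosts_entries are hypothetical, and the sketch covers only the cluster entries, not the IPv6 lines.

```python
# Host names and addresses taken from section I of this article
CLUSTER = {
    "hadoop-master": "192.168.1.18",
    "hadoop-slave1": "192.168.1.19",
    "hadoop-slave2": "192.168.1.20",
}


def render_hosts_entries(cluster):
    """Return one 'address name' line per host, in mapping order."""
    return "".join("{} {}\n".format(ip, name) for name, ip in cluster.items())


print(render_hosts_entries(CLUSTER), end="")
```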
[root@centos75 ~]# ansible hadoop -m copy -a "src=/root/hosts dest=/etc/hosts"
192.168.1.20 | SUCCESS => {
    "changed": true,
    "checksum": "214f72ce3329805c07748997e11313fffb03f667",
    "dest": "/etc/hosts",
    "gid": 0,
    "group": "root",
    "md5sum": "127193e1ec4773ce0195636c5ac2bf3a",
    "mode": "0644",
    "owner": "root",
    "size": 298,
    "src": "/root/.ansible/tmp/ansible-tmp-1536384515.76-109467000571031/source",
    "state": "file",
    "uid": 0
}
192.168.1.18 | SUCCESS => {
    "changed": true,
    "checksum": "214f72ce3329805c07748997e11313fffb03f667",
    "dest": "/etc/hosts",
    "gid": 0,
    "group": "root",
    "md5sum": "127193e1ec4773ce0195636c5ac2bf3a",
    "mode": "0644",
    "owner": "root",
    "size": 298,
    "src": "/root/.ansible/tmp/ansible-tmp-1536384515.74-269105082907411/source",
    "state": "file",
    "uid": 0
}
192.168.1.19 | SUCCESS => {
    "changed": true,
    "checksum": "214f72ce3329805c07748997e11313fffb03f667",
    "dest": "/etc/hosts",
    "gid": 0,
    "group": "root",
    "md5sum": "127193e1ec4773ce0195636c5ac2bf3a",
    "mode": "0644",
    "owner": "root",
    "size": 298,
    "src": "/root/.ansible/tmp/ansible-tmp-1536384515.75-259083114686776/source",
    "state": "file",
    "uid": 0
}
The hosts file on each managed host can also be inspected with:
ansible hadoop -m shell -a 'cat /etc/hosts'
ansible hadoop -m shell -a 'ls -lhat /etc/hosts'
(2) Playbook mode:
Start the Hadoop cluster services:
[root@centos75 ~]# vi hadoop-start.yml
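---
# "---" is the YAML document start marker. It is optional when a file holds a single document, which is why omitting it also works. A second "---" starts a new document, and Ansible expects one document per playbook file, so repeating it causes an error.
- hosts: hadoop
  # Note: "-" must be followed by a space, and so must ":".
  tasks:
    # Note: indent with spaces (two per level by convention); tabs are not allowed.
    - name: startup hadoop datanode services
      # Use the absolute path here even though hadoop-2.7.3/sbin is on the PATH of the cluster servers. The likely reason is that Ansible runs the command in a non-interactive, non-login shell, which typically does not load the profile files where that PATH is set.
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh start datanode
- hosts: 192.168.1.18
  tasks:
    - name: startup hadoop namenode services
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh start namenode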
~
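The formatting rules noted in the playbook comments (a single "---" marker, spaces rather than tabs for indentation) can be checked before running the playbook. The Python sketch below is a rough text check under those two assumptions only; it is not a YAML parser, and the function name check_playbook_text is hypothetical.

```python
def check_playbook_text(text):
    """Return a list of formatting problems found in playbook text (rough check only)."""
    problems = []
    marker_count = 0
    for number, line in enumerate(text.splitlines(), start=1):
        if line.rstrip() == "---":
            marker_count += 1
        # Leading whitespace is whatever lstrip() would remove from the line
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append("line {}: tab used for indentation".format(number))
    if marker_count > 1:
        problems.append("more than one '---' document marker")
    return problems


sample = "---\n- hosts: hadoop\n  tasks:\n    - name: demo\n      shell: /bin/true\n"
print(check_playbook_text(sample))
```

For an authoritative check, ansible-playbook accepts a --syntax-check option that parses the playbook without executing it.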
[root@centos75 ~]# ansible-playbook hadoop-start.yml
PLAY [hadoop] ******************************************************************

TASK [Gathering Facts] *********************************************************
ok: [192.168.1.20]
ok: [192.168.1.19]
ok: [192.168.1.18]

TASK [startup hadoop datanode services] ****************************************
changed: [192.168.1.19]
changed: [192.168.1.18]
changed: [192.168.1.20]

PLAY [192.168.1.18] ************************************************************

TASK [Gathering Facts] *********************************************************
ok: [192.168.1.18]

TASK [startup hadoop namenode services] ****************************************
changed: [192.168.1.18]

PLAY RECAP *********************************************************************
192.168.1.18 : ok=4 changed=2 unreachable=0 failed=0
192.168.1.19 : ok=2 changed=1 unreachable=0 failed=0
192.168.1.20 : ok=2 changed=1 unreachable=0 failed=0
The running services can be checked on the cluster servers:
root@hadoop-master:~# jps
8976 DataNode
9231 Jps
9093 NameNode
root@hadoop-slave1:~# jps
7058 Jps
6972 DataNode
Stop the Hadoop cluster services:
[root@centos75 ~]# vi hadoop-stop.yml
---
- hosts: hadoop
  tasks:
    - name: stop hadoop datanode services
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh stop datanode
- hosts: 192.168.1.18
  tasks:
    - name: stop hadoop namenode services
      shell: /root/hadoop-2.7.3/sbin/hadoop-daemon.sh stop namenode
~
[root@centos75 ~]# ansible-playbook hadoop-stop.yml
PLAY [hadoop] ******************************************************************

TASK [Gathering Facts] *********************************************************
ok: [192.168.1.20]
ok: [192.168.1.19]
ok: [192.168.1.18]

TASK [stop hadoop datanode services] *******************************************
changed: [192.168.1.20]
changed: [192.168.1.19]
changed: [192.168.1.18]

PLAY [192.168.1.18] ************************************************************

TASK [Gathering Facts] *********************************************************
ok: [192.168.1.18]

TASK [stop hadoop namenode services] *******************************************
changed: [192.168.1.18]

PLAY RECAP *********************************************************************
192.168.1.18 : ok=4 changed=2 unreachable=0 failed=0
192.168.1.19 : ok=2 changed=1 unreachable=0 failed=0
192.168.1.20 : ok=2 changed=1 unreachable=0 failed=0
The runs above show that Ansible provides centralized control over starting and stopping the cluster services.
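When these playbooks are run from a script, the PLAY RECAP lines are the quickest place to confirm that no host failed or was unreachable. The Python sketch below parses recap lines in the format shown in this article; the names parse_recap, failed_hosts, and SAMPLE are hypothetical, and the parsing assumes every counter appears as key=value, which may not hold for other Ansible versions or output callbacks.

```python
import re

# One recap line looks like: "192.168.1.18 : ok=4 changed=2 unreachable=0 failed=0"
RECAP_LINE = re.compile(r"^(\S+)\s*:\s*(.*)$")

# Recap copied from the hadoop-stop.yml run in this article
SAMPLE = """PLAY RECAP *********************************************************************
192.168.1.18 : ok=4 changed=2 unreachable=0 failed=0
192.168.1.19 : ok=2 changed=1 unreachable=0 failed=0
192.168.1.20 : ok=2 changed=1 unreachable=0 failed=0
"""


def parse_recap(output):
    """Map each host in the PLAY RECAP section to its counters."""
    results = {}
    in_recap = False
    for line in output.splitlines():
        if line.startswith("PLAY RECAP"):
            in_recap = True
            continue
        if not in_recap or not line.strip():
            continue
        match = RECAP_LINE.match(line)
        if match:
            host, counters = match.groups()
            pairs = (item.split("=", 1) for item in counters.split())
            results[host] = {key: int(value) for key, value in pairs}
    return results


def failed_hosts(results):
    """Return hosts with at least one failed task or an unreachable status."""
    return sorted(
        host for host, c in results.items()
        if c.get("failed", 0) or c.get("unreachable", 0)
    )


print(failed_hosts(parse_recap(SAMPLE)))
```

In a script, the exit status of ansible-playbook is usually the simpler signal; parsing the recap is mainly useful when a per-host summary is needed.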