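Introduction to parallel command execution tools
For an operations engineer, once the number of machines reaches a certain scale,
managing them in bulk becomes exhausting. Fortunately there are many tools that
run commands on many hosts in parallel, such as pssh and Python's fabric; Taobao
also has pgm, which is built on top of pssh. All of these help us run commands in
bulk. With the rise of tools such as puppet and ansible, these parallel tools have
become less prominent, but they are still very useful. Today we look at how to use pssh.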
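Python tools for parallel command execution
When I worked at Alibaba, the parallel tool we used was pgm. The tools currently available to choose from are:
fabric, pssh, pgm
Official description of pssh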
PSSH provides parallel versions of OpenSSH and related tools.
Included are pssh, pscp, prsync, pnuke, and pslurp. The project includes psshlib which can be used within custom applications.
PSSH is supported on Python 2.4 and greater (including Python 3.1 and greater). It was originally written and maintained by Brent N. Chun. Due to his busy schedule, Brent handed over maintenance to Andrew McNabb in October 2009.
Installing and deploying pssh
Download parallel-ssh
git clone https://github.com/ruiaylin/pssh.git
Install
cd pssh/
python setup.py install
Configure the list of servers to manage in bulk
The host file (named pssh_config here) lists one host per line, in the form [user@]host[:port]:
192.168.102.81:10000
192.168.8.183:10000
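To make the host file format concrete, here is a minimal Python sketch that splits such entries into user, host and port. The function names are hypothetical and are not part of pssh. Skipping blank lines and '#' comments is an assumption of this sketch, and IPv6 addresses are not handled.

```python
def parse_host_line(line, default_user=None, default_port=22):
    """Split one '[user@]host[:port]' entry into (user, host, port)."""
    entry = line.strip()
    user = default_user
    if "@" in entry:
        user, entry = entry.split("@", 1)
    host, sep, port = entry.partition(":")
    return (user, host, int(port) if sep else default_port)


def parse_host_file(text):
    """Parse host-file text, skipping blank lines and '#' comment lines."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            hosts.append(parse_host_line(line))
    return hosts


sample = "192.168.102.81:10000\n192.168.8.183:10000\n"
for user, host, port in parse_host_file(sample):
    print(host, port)
```

A check like this can be run over a host file before handing it to pssh, so that a malformed port shows up as a ValueError locally instead of as a failed connection.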
pssh can also be configured with host groups, so that operations can be applied per group, for example per cluster or per role. A short test appears later in this post.
The hostgroups configuration looks like this:
root@ruiaylin-virtual-machine:~/batch# cat /etc/pssh/hostgroups
master: 192.168.19.132,192.168.19.135
slave: 192.168.19.134
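The group file shown above is a simple "name: host,host" mapping. The sketch below reads that layout into a dictionary. It is inferred only from the example in this post, so the tool's real parser may accept more or behave differently. parse_hostgroups is a hypothetical helper name.

```python
def parse_hostgroups(text):
    """Map each group name to its list of hosts ('name: host1,host2' per line)."""
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # ignore blank or malformed lines in this sketch
        # Split at the first colon only, so 'host:port' members stay intact.
        name, _, members = line.partition(":")
        groups[name.strip()] = [h.strip() for h in members.split(",") if h.strip()]
    return groups


sample = "master: 192.168.19.132,192.168.19.135\nslave: 192.168.19.134\n"
groups = parse_hostgroups(sample)
print(groups["master"])
print(groups["slave"])
```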
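For convenience, set up key-based trust from the management machine (some companies
call it a jump host, others a bastion host) to every host. This avoids typing a
password for each ssh operation, which becomes maddening with many machines.
Set up the trust channel:
cd
mkdir .ssh
ssh-keygen -t dsa
cd .ssh ; ll
[root@xxxxxx .ssh]# ll
total 32
-rw------- 1 root root 1588 Nov 19 14:29 authorized_keys
-rw------- 1 root root 668 Sep 11 16:15 id_dsa
-rw-r--r-- 1 root root 602 Sep 11 16:15 id_dsa.pub
-rw-r--r--. 1 root root 14490 Nov 13 14:58 known_hosts
# Then copy the contents of id_dsa.pub into /home/youruser/.ssh/authorized_keys on each host, and the trust channel is done.
# If you still cannot log in over ssh directly afterwards, the problem is probably the permissions on authorized_keys.
# The file is normally set to 600 and the ~/.ssh directory to 700.
# Recent OpenSSH releases disable DSA keys by default, so ssh-keygen -t rsa or -t ed25519 is the safer choice there.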
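Since wrong permissions are the usual reason key login fails, a small check can save some guessing. The sketch below flags a ~/.ssh directory or authorized_keys file that is writable by group or others, which sshd refuses when StrictModes is on (its default). ssh_permission_problems is a hypothetical helper, and the 700/600 values are the conventional ones rather than the only modes sshd accepts.

```python
import os
import stat


def ssh_permission_problems(ssh_dir):
    """Return human-readable problems for ssh_dir and its authorized_keys file."""
    problems = []
    checks = [
        (ssh_dir, 0o700),
        (os.path.join(ssh_dir, "authorized_keys"), 0o600),
    ]
    for path, expected in checks:
        if not os.path.exists(path):
            problems.append("%s is missing" % path)
            continue
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if mode & 0o022:  # writable by group or others
            problems.append("%s has mode %o, expected %o" % (path, mode, expected))
    return problems
```

On a target host it could be called as ssh_permission_problems(os.path.expanduser("~/.ssh")). An empty list means neither path is writable by group or others.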
Examples:
Using pssh to execute commands
pssh options
OPTIONS
-h host_file        # -h + name of a file containing the host IPs
--hosts host_file
-H [user@]host[:port]        # -H + [user@]host IP[:port]; the parts in [ ] are optional. For several hosts, wrap them in " " and separate the hosts with spaces
--host [user@]host[:port]
-H "[user@]host[:port] [ [user@]host[:port ] ... ]"
--host "[user@]host[:port] [ [user@]host[:port ] ... ]"
-l user        # -l + user name used to connect to the remote hosts
--user user
-p parallelism        # -p + number of concurrent connections
--par parallelism
-t timeout        # -t + timeout in seconds
--timeout timeout
-o outdir        # -o + output directory; a file named [user@]host IP[:port] is created there for each host to hold its output
--outdir outdir
-e errdir        # -e + directory for error output
--errdir errdir
-x args        # -x + extra arguments for the ssh connection, e.g. -x "-o StrictHostKeyChecking=no" skips the yes/no prompt when connecting
--extra-args args
-X arg
--extra-arg arg
-O options        # -O + an option in SSH config file format; -O may be given several times
--options options
-A
--askpass
-i        # -i shows the output directly in the current terminal
--inline
--inline-stdout
-v        # -v shows the error messages from the ssh connection
--verbose
-I
--send-input
Read input and send to each ssh process. Since ssh allows a command script to be sent on standard input, the -I option may be used in lieu of the command argument.
-P        # -P prints the output once a host is connected: the command output comes first, followed by the host information
--print
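When pssh is driven from a script, it helps to assemble the options as an argument list instead of a hand-quoted string. The sketch below does this for a few of the flags listed above (-h, -l, -p, -t, -x, -i). build_pssh_argv and its parameter names are hypothetical; only the flags themselves come from the option list.

```python
import shlex


def build_pssh_argv(host_file, command, user=None, parallelism=None,
                    timeout=None, extra_ssh_args=None, inline=True):
    """Build a pssh argument list from a handful of common options."""
    argv = ["pssh", "-h", host_file]
    if user:
        argv += ["-l", user]
    if parallelism is not None:
        argv += ["-p", str(parallelism)]
    if timeout is not None:
        argv += ["-t", str(timeout)]
    if extra_ssh_args:
        argv += ["-x", extra_ssh_args]
    if inline:
        argv.append("-i")
    argv.append(command)  # the remote command stays one argument, no shell quoting needed
    return argv


argv = build_pssh_argv("pssh_config", "ls /root/works/", user="root",
                       parallelism=10, timeout=30)
# Show the equivalent shell command line for copy and paste.
print(" ".join(shlex.quote(a) for a in argv))
```

On a machine where pssh is installed, the list can be passed directly to subprocess.run(argv). Because no local shell is involved, the remote command does not need an extra layer of quoting.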
Run a command and check the result
# create a few directories
pssh -h pssh_config -l root -i 'mkdir -p /root/works/{script,tmp,log} '
# check what was just created
[root@dbtaskm works]# pssh -h pssh_config -l root -i 'ls /root/works/ '
[1] 14:12:24 [SUCCESS] 192.168.102.81:10000
log script tmp
[2] 14:12:24 [SUCCESS] 192.168.8.183:10000
log script tmp
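With -i, each host's output follows a status line such as "[1] 14:12:24 [SUCCESS] host". The sketch below groups that text by host so a script can spot failures. It assumes plain, uncoloured output in exactly the layout shown above, with FAILURE as the non-success marker. parse_pssh_output is a hypothetical helper, not something pssh ships.

```python
import re

# Matches status lines like: [1] 14:12:24 [SUCCESS] 192.168.102.81:10000
STATUS_RE = re.compile(r"^\[(\d+)\] (\d{2}:\d{2}:\d{2}) \[(SUCCESS|FAILURE)\] (\S+)")


def parse_pssh_output(text):
    """Group inline pssh output by host: {host: {"ok": bool, "output": [lines]}}."""
    results = {}
    current = None
    for line in text.splitlines():
        match = STATUS_RE.match(line)
        if match:
            current = match.group(4)
            results[current] = {"ok": match.group(3) == "SUCCESS", "output": []}
        elif current is not None:
            results[current]["output"].append(line)
    return results


sample = (
    "[1] 14:12:24 [SUCCESS] 192.168.102.81:10000\n"
    "log script tmp\n"
    "[2] 14:12:24 [SUCCESS] 192.168.8.183:10000\n"
    "log script tmp\n"
)
for host, info in parse_pssh_output(sample).items():
    print(host, info["ok"], info["output"])
```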
Multiple commands must be separated by semicolons
pssh -h pssh_config -l root -i 'cd /root/works/ ; ls '
# result
[root@dbtaskm works]# pssh -h pssh_config -l root -i 'cd /root/works/ ; ls '
[1] 14:13:33 [SUCCESS] 192.168.8.183:10000
log script tmp
[2] 14:13:33 [SUCCESS] 192.168.102.81:10000
log script tmp
# disable selinux
pssh -h servers.txt -l root -P "sed -i '/SELINUX=enforcing/s/SELINUX=enforcing/SELINUX=disabled/' /etc/sysconfig/selinux"
pscp: distribute files to the machines in the host list
Distribute the file collect-mysql.py to the target directory on every machine in the list
pscp -h pssh_config collect-mysql.py /root/works/tmp/
# check the result
pssh -h pssh_config -l root -i 'cd /root/works/tmp ; ls '
[1] 14:18:09 [SUCCESS] 192.168.102.81:10000
collect-mysql.py
[2] 14:18:09 [SUCCESS] 192.168.8.183:10000
collect-mysql.py
If directories are involved, use the following command (-r copies recursively)
pscp -h pssh_config -l root -r /root/bin/* /root/bin/
pslurp: copy files back to the management machine
pslurp -L /root/works/testlurp/ -h ../pssh_config /root/works/tmp/collect-mysql.py mysql.py
[1] 14:53:55 [SUCCESS] 192.168.102.81:10000
[2] 14:53:55 [SUCCESS] 192.168.8.183:10000
# check the result
[root@dbtaskm testlurp]# cd /root/works/testlurp/ ; ls *
192.168.102.81:
mysql.py

192.168.8.183:
mysql.py
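As the listing shows, pslurp puts each host's copy in its own subdirectory under the -L directory. The sketch below walks that layout and reports hosts whose file did not arrive. Both function names are hypothetical, and the layout is assumed to be exactly the one in the listing above (one directory per host, named by its address).

```python
import os


def collected_files(local_dir):
    """Map each host subdirectory of local_dir to the sorted files inside it."""
    result = {}
    for host in sorted(os.listdir(local_dir)):
        host_dir = os.path.join(local_dir, host)
        if os.path.isdir(host_dir):
            result[host] = sorted(os.listdir(host_dir))
    return result


def missing_hosts(local_dir, expected_hosts, filename):
    """Return the expected hosts whose subdirectory lacks the given file."""
    found = collected_files(local_dir)
    return [h for h in expected_hosts if filename not in found.get(h, [])]
```

For the run above, missing_hosts("/root/works/testlurp/", ["192.168.102.81", "192.168.8.183"], "mysql.py") would be expected to return an empty list.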
pnuke
The pnuke command is useful when you want to kill a bunch of processes on a set of machines. For example, suppose you’ve got a bunch of java processes running on three nodes that you’d like to nuke (let’s use the three machines from the pssh example). Here you would do the following:
# pnuke -h ips.txt -l irb2 java
Success on 128.112.152.122:22
Success on 18.31.0.190:22
Success on 128.232.103.201:22
Testing host groups separately
Configure /etc/pssh/hostgroups
root@ruiaylin-virtual-machine:~/batch# cat /etc/pssh/hostgroups
master: 192.168.19.132,192.168.19.135
slave: 192.168.19.134
pssh :
root@ruiaylin-virtual-machine:/etc/pssh# pssh -g master -i hostname
[1] 14:20:06 [SUCCESS] 192.168.19.132
mytestdb02
[2] 14:20:06 [SUCCESS] 192.168.19.135
mytestdb01
root@ruiaylin-virtual-machine:/etc/pssh# pssh -g master -i 'ifconfig |grep inet | grep -v 127 '
[1] 14:20:35 [SUCCESS] 192.168.19.135
inet addr:192.168.19.135 Bcast:192.168.19.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe2a:d6db/64 Scope:Link
inet6 addr: ::1/128 Scope:Host
[2] 14:20:35 [SUCCESS] 192.168.19.132
inet addr:192.168.19.132 Bcast:192.168.19.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe78:dfd8/64 Scope:Link
inet6 addr: ::1/128 Scope:Host
root@ruiaylin-virtual-machine:/etc/pssh# pssh -g slave -i hostname
[1] 14:20:50 [SUCCESS] 192.168.19.134
mytaskdb
root@ruiaylin-virtual-machine:/etc/pssh#
pscp :
Create two local files for the test:
root@ruiaylin-virtual-machine:~/batch# ls file1 file2
file1 file2
root@ruiaylin-virtual-machine:~/batch# cat file1 file2
test master
test slave
root@ruiaylin-virtual-machine:~/batch#
Run
root@ruiaylin-virtual-machine:~/batch# pscp -g slave file2 /root/bin/filetest
[1] 14:22:28 [SUCCESS] 192.168.19.134
root@ruiaylin-virtual-machine:~/batch# pscp -g master file1 /root/bin/filetest
[1] 14:22:36 [SUCCESS] 192.168.19.132
[2] 14:22:36 [SUCCESS] 192.168.19.135
root@ruiaylin-virtual-machine:~/batch#
Result
root@ruiaylin-virtual-machine:~/batch# pssh -g master -i 'cat /root/bin/filetest '
[1] 14:23:02 [SUCCESS] 192.168.19.132
test master
[2] 14:23:02 [SUCCESS] 192.168.19.135
test master
root@ruiaylin-virtual-machine:~/batch# pssh -g slave -i 'cat /root/bin/filetest '
[1] 14:23:10 [SUCCESS] 192.168.19.134
test slave
root@ruiaylin-virtual-machine:~/batch#
Summary
pssh is a Python-based tool for managing hosts in batches. Python's fabric module can do similar work, and I may write that up later when time allows.