Python's subprocess module: calling child programs (running other commands), and how a child script finds its current path

A running Python process can spawn child processes, and a child process can run other commands: shell, Python, Java, C programs, and so on.

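One way to spawn a child process is

the os module

See: http://blog.csdn.net/longshenlmj/article/details/8331526

The improved option is the subprocess module. It overlaps with part of what os offers and can be seen as a refined, dedicated tool for this one job.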

python subprocess

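It is used to call a child program while your own program runs, and to interact with it through stdout, stdin and stderr.

stdout: where the child's output goes, e.g. a file or the screen
stdin: where the child's input comes from, e.g. a file or a file object
stderr: where the child's error output goes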
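
To make the three streams concrete, here is a minimal sketch of a single exchange over all of them. The child program (CHILD_CODE) and the helper name talk_to_child are made up for illustration, and the child is started with sys.executable so the example does not depend on any external script.

```python
import subprocess
import sys

# Hypothetical child program: reads all of stdin, writes the upper-cased
# text to stdout and a short note to stderr.
CHILD_CODE = (
    "import sys\n"
    "data = sys.stdin.read()\n"
    "sys.stdout.write(data.upper())\n"
    "sys.stderr.write('read %d chars' % len(data))\n"
)


def talk_to_child(text):
    """Send text to the child's stdin; return (exit code, stdout, stderr)."""
    p = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # exchange text instead of bytes
    )
    out, err = p.communicate(text)  # writes stdin, reads both pipes, waits
    return p.returncode, out, err


if __name__ == "__main__":
    print(talk_to_child("hello\n"))
```

Each stream that should be captured has to be set to PIPE; a stream left at None is simply inherited from the parent.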

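Two common ways to use it (with a shell command as the example):

1. subprocess.Popen('script or shell command', shell=True)   # non-blocking, runs in parallel
2. subprocess.call('script or shell command', shell=True)   # waits for the child to finish before continuing

The difference is that the former does not block and runs in parallel with the main program, while the latter waits until the command has finished. To make the former block as well, add wait():

p = subprocess.Popen('script or shell command', shell=True)
a = p.wait()  # blocks, then returns the child's exit code
A concrete example: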

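        hadoop_cmd = "hadoop fs -ls %s" % (hive_tb_path)
        p = subprocess.Popen(hadoop_cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        # wait() blocks until the child exits and returns its exit status:
        # 0 on success, non-zero on failure. With large output prefer
        # p.communicate(); see the deadlock section at the end.
        ret = p.wait()
        print(ret)  # 0 if the command succeeded, non-zero (often 1) otherwise
        # The command's output is read with
        print(p.stdout.read())
        # and its error output with
        print(p.stderr.read())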
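
The difference between starting a child and waiting for it can be observed directly. The sketch below is only an illustration: the helper name, the sleep time and the exit code 3 are arbitrary choices. It also shows that a failing child does not have to return exactly 1.

```python
import subprocess
import sys
import time


def run_nonblocking_then_wait(seconds=0.5, exit_code=3):
    """Start a sleeping child, check that Popen returned at once, then wait."""
    child_code = "import sys, time; time.sleep(%s); sys.exit(%d)" % (seconds, exit_code)
    start = time.time()
    p = subprocess.Popen([sys.executable, "-c", child_code])
    launch_time = time.time() - start   # Popen returns without waiting
    still_running = p.poll() is None    # poll() is None while the child runs
    ret = p.wait()                      # blocks until the child has exited
    total_time = time.time() - start
    return launch_time, still_running, ret, total_time


if __name__ == "__main__":
    print(run_nonblocking_then_wait())
```

subprocess.call() behaves like the Popen() plus wait() pair here: it returns the same exit code, but only after the child has finished.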

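Examples of calling a child process:

Approach 1

import subprocess
p = subprocess.Popen('./test/dirtest.py', stdout=subprocess.PIPE, shell=True)
print(p.stdout.readlines())  # reads everything the child wrote to stdout
out, err = p.communicate()   # out is empty now (already consumed); err is None because stderr is not a pipe
print(out)
print(err)

## This is a one-shot exchange: the input goes to stdin, the child runs to completion, and the result comes back on stdout. After that single call, communicate() closes the pipes. If you need several rounds of interaction with the child process, communicate() will not do; talk to it step by step instead, as follows: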

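    # The child must be a program that reads its stdin (a shell here); "ls -l" would ignore the input
    p = subprocess.Popen(["sh"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, shell=False)
    # Input
    p.stdin.write('your command\n')  # in Python 3, write bytes: b'your command\n'
    p.stdin.flush()
    # Read the output
    p.stdout.readline()  # one line; blocks until the child has printed one
    p.stdin.close()      # signal end of input so that read() can reach EOF
    p.stdout.read()      # everything that is left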
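
As an illustration of several rounds of interaction, the sketch below talks to a child line by line. The child program (CHILD_CODE) and the helper converse are hypothetical; the important assumption is that the child flushes its output after every line, otherwise the parent's readline() would block.

```python
import subprocess
import sys

# Hypothetical child: echoes every input line back in upper case and
# flushes after each line so the parent can read the reply immediately.
CHILD_CODE = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def converse(lines):
    """Send each line to the child and collect one reply per line."""
    p = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode
    )
    replies = []
    for line in lines:
        p.stdin.write(line + "\n")
        p.stdin.flush()                                   # push the line to the child now
        replies.append(p.stdout.readline().rstrip("\n"))  # wait for its answer
    p.stdin.close()  # end of input: the child's loop ends and it exits
    p.wait()
    p.stdout.close()
    return replies, p.returncode


if __name__ == "__main__":
    print(converse(["hello", "world"]))
```

This pattern only works for strict request/reply protocols. If the child may write more than the parent reads, the deadlock described at the end of this article applies.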

Approach 2

    ret = subprocess.call('ping -c 1 %s' % ip, shell=True, stdout=open('/dev/null', 'w'), stderr=subprocess.STDOUT)
    if ret == 0:
        print('%s is alive!' % ip)
    elif ret == 1:
        print('%s is down...' % ip)

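What the shell parameter means

    Both call() and Popen() take a shell parameter. It defaults to False and can be set to True.
    shell specifies whether the command is run through a shell. With shell=True the command is handed to /bin/sh -c, so args should be a single string rather than a sequence; with shell=False, args should be a list that holds the program and each argument as separate items. For example,
    subprocess.Popen("cat test.txt", shell=True)
is equivalent to
    subprocess.Popen(["/bin/sh", "-c", "cat test.txt"])
In detail:
    On Linux, with shell=False, Popen calls os.execvp() to run the program given by args;
    On Windows, Popen calls CreateProcess() to run the external program given by args. args may be a string or a sequence; a sequence is converted to a string by list2cmdline(). Note, however, that not every MS Windows program parses its command line the way list2cmdline() assumes.
    So on Windows
        subprocess.Popen("notepad.exe test.txt", shell=True)
        is equivalent to
        subprocess.Popen("cmd.exe /C " + "notepad.exe test.txt")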
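
The effect of the shell parameter is easy to see with a command that contains a shell metacharacter. This is a small sketch that assumes a POSIX system where echo and tr are available; the two function names are made up.

```python
import subprocess


def run_through_shell():
    # One string, interpreted by /bin/sh: the "|" builds a pipeline.
    return subprocess.check_output("echo hello | tr a-z A-Z", shell=True)


def run_without_shell():
    # A list, executed directly: "|" is plain text inside echo's only argument.
    return subprocess.check_output(["echo", "hello | tr a-z A-Z"])


if __name__ == "__main__":
    print(run_through_shell())
    print(run_without_shell())
```

With shell=True the text is upper-cased by the pipeline; without a shell, echo prints the whole string unchanged, pipe character included.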

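Problems shell=True can cause

 Passing shell=True together with untrusted input can be a security problem.
Warning: running a shell command built from untrusted input leaves the program open to shell injection, a serious security flaw that can lead to arbitrary command execution. For this reason, using shell=True is strongly discouraged when the command string is built from external input:
    >>> from subprocess import call
    >>> filename = input("What file would you like to display?\n")
    What file would you like to display?
    non_existent; rm -rf / #
    >>> call("cat " + filename, shell=True) # Uh-oh. This will end badly...

shell=False disables all shell-based features, so it is not affected by this vulnerability; see the notes in the Popen constructor documentation for helpful hints on getting shell=False to work.
When you do use shell=True, pipes.quote() (shlex.quote() in Python 3) can be used to properly escape whitespace and shell metacharacters in strings that will be used to build the shell command.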
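
The injection and the two remedies can be tried safely with a harmless payload. The sketch below assumes Python 3 (for shlex.quote) and a POSIX system with cat; the function names and the "echo INJECTED" payload are made up, standing in for the destructive command in the warning above.

```python
import shlex
import subprocess


def _stdout_of(args, shell):
    """Run a command and return whatever it wrote to stdout."""
    p = subprocess.Popen(args, shell=shell,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, _ = p.communicate()
    return out


def cat_unsafe(filename):
    # The shell sees "cat non_existent; echo INJECTED" and runs both commands.
    return _stdout_of("cat " + filename, shell=True)


def cat_with_list(filename):
    # No shell: the whole string is one (non-existent) file name.
    return _stdout_of(["cat", filename], shell=False)


def cat_quoted(filename):
    # Still a shell, but the file name is quoted into a single word.
    return _stdout_of("cat " + shlex.quote(filename), shell=True)


if __name__ == "__main__":
    payload = "non_existent; echo INJECTED"
    print(cat_unsafe(payload), cat_with_list(payload), cat_quoted(payload))
```

Only the unsafe variant executes the injected echo. The other two fail to find a file with that odd name and print nothing on stdout.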

A few sites that cover subprocess in more detail:

http://python.usyiyi.cn/python_278/library/subprocess.html (English: https://docs.python.org/2/library/subprocess.html)
http://ipseek.blog.51cto.com/1041109/807513
https://blog.linuxeye.com/375.html
http://blog.csdn.net/imzoer/article/details/8678029

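The current-path question in child scripts

Whether you call the child program with os or with subprocess, you run into the question of the current path: when the child script asks for its current path, does it get the parent's path or its own?
There are two main ways to get a script's path in Python:
    1) os.path.dirname(os.path.abspath("__file__"))
    2) sys.path[0]
See http://blog.csdn.net/longshenlmj/article/details/25148935.
    The first returns the parent program's directory (when the parent is started from its own directory). Because "__file__" is written in quotes, it is an ordinary string rather than the __file__ variable, so abspath() resolves it against the current working directory, which the child inherits from the parent.
    The second returns the directory of the child script itself.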

Example:

The main script callpy.py is in /home/wizad/lmj,
and the child script dirtest.py that it calls is in /home/wizad/lmj/test

[wizad@srv26 lmj]$ cat callpy.py

import subprocess
p = subprocess.Popen('python ./test/dirtest.py', stdout=open('dirtest.txt', 'w'), shell=True)

[wizad@srv26 test]$ cat dirtest.py

import os
import sys
file_path = os.path.dirname(os.path.abspath("__file__"))  # quoted: resolved against the working directory
print(file_path + "11111")
cur_path = sys.path[0]  # directory that contains this script
print(cur_path + "22222")

Running python callpy.py and then cat dirtest.txt shows:

/home/wizad/lmj11111
/home/wizad/lmj/test22222

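The output goes to the file dirtest.txt. It shows that method 1 gives the parent program's path and method 2 gives the child program's path.
stdout can also be set to PIPE, in which case the output can be read and printed directly,
for example:
1)

p = subprocess.Popen('python ./test/dirtest.py', stdout=subprocess.PIPE, shell=True)
out, err = p.communicate()
print(out)
print(err)

Output: [wizad@srv26 lmj]$ python callpy.py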

/home/wizad/lmj11111
/home/wizad/lmj/test22222

None

2)

p = subprocess.Popen('python ./test/dirtest.py', stdout=subprocess.PIPE, shell=True)
print(p.stdout.readlines())
out, err = p.communicate()
print(out)
print(err)

Output:

['/home/wizad/lmj11111\n', '/home/wizad/lmj/test22222\n']

None

Both ways of reading print the result straight to the screen.
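
The path behaviour above can be reproduced in a throwaway directory. In this sketch the directory layout mirrors the example (a parent directory with a test/dirtest.py inside), but the temporary location, the helper name and the extra line that uses the unquoted __file__ variable are additions for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Child script: prints three candidate "current paths", one per line.
CHILD_SOURCE = (
    "import os, sys\n"
    "print(os.path.dirname(os.path.abspath('__file__')))  # quoted: a literal string\n"
    "print(os.path.dirname(os.path.abspath(__file__)))    # unquoted: the script's own path\n"
    "print(sys.path[0])\n"
)


def compare_path_methods():
    """Run test/dirtest.py from its parent directory and collect its output."""
    parent_dir = os.path.realpath(tempfile.mkdtemp())
    child_dir = os.path.join(parent_dir, "test")
    os.mkdir(child_dir)
    with open(os.path.join(child_dir, "dirtest.py"), "w") as f:
        f.write(CHILD_SOURCE)
    out = subprocess.check_output(
        [sys.executable, os.path.join("test", "dirtest.py")],
        cwd=parent_dir,            # the child inherits this working directory
        universal_newlines=True,
    )
    return parent_dir, child_dir, out.splitlines()


if __name__ == "__main__":
    print(compare_path_methods())
```

The quoted form follows the working directory, so it changes with wherever the parent was started; the unquoted __file__ and sys.path[0] both point at the child script's own directory.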

Some further notes on the subprocess module, quoted from other material:

subprocess.Popen(
      args, 
      bufsize=0, 
      executable=None,
      stdin=None,
      stdout=None, 
      stderr=None, 
      preexec_fn=None, 
      close_fds=False, 
      shell=False, 
      cwd=None, 
      env=None, 
      universal_newlines=False, 
      startupinfo=None, 
      creationflags=0)

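1) args can be a string or a sequence (e.g. a list or tuple) and specifies the executable and its arguments. For a sequence, the first item is usually the path of the executable. The executable parameter can also be used to give the path of the executable explicitly.
2) bufsize: buffering. 0 means unbuffered, 1 means line-buffered, any other positive value means a buffer of roughly that size, and a negative value means the system default (usually fully buffered).
3) stdin, stdout and stderr are the program's standard input, output and error handles. Each can be PIPE, a file descriptor or a file object, or None, which means it is inherited from the parent process.
4) preexec_fn is only effective on Unix. It specifies a callable object that is called in the child process just before the child program is run.
5) close_fds: on Windows, if close_fds is True, the new child process does not inherit the parent's input, output and error handles. In Python 2, close_fds cannot be True while the child's standard input, output or error (stdin, stdout, stderr) is also redirected.
6) shell: if True, the program is run through the shell.
7) cwd sets the child's current directory.
8) env is a dict that specifies the child's environment variables. If env = None, the child inherits the parent's environment.
9) universal_newlines: different operating systems use different line endings, e.g. '\r\n' on Windows and '\n' on Linux. If this parameter is True, Python treats all of these line endings as '\n'.
10) startupinfo and creationflags are only effective on Windows. They are passed to the underlying CreateProcess() function and set properties of the child process such as the appearance of its main window or its priority.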
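
As a small illustration of the cwd and env parameters, the sketch below starts a child in a temporary directory with one extra environment variable. The variable name DEMO_GREETING and the helper name are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_with_cwd_and_env():
    """Start a child in a temp directory with one extra environment variable."""
    workdir = os.path.realpath(tempfile.mkdtemp())
    # Copy the parent's environment and add a made-up variable; passing a dict
    # with only this one key would drop PATH and everything else.
    child_env = dict(os.environ, DEMO_GREETING="hello")
    out = subprocess.check_output(
        [sys.executable, "-c",
         "import os; print(os.getcwd()); print(os.environ['DEMO_GREETING'])"],
        cwd=workdir,
        env=child_env,
        universal_newlines=True,
    )
    return workdir, out.splitlines()


if __name__ == "__main__":
    print(run_with_cwd_and_env())
```

Neither setting affects the parent: its own working directory and os.environ stay as they were.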

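Popen methods
1) Popen.poll(): checks whether the child process has finished. Sets and returns the returncode attribute.
2) Popen.wait(): waits for the child process to finish. Sets and returns the returncode attribute.
3) Popen.communicate(input=None): interacts with the child process by sending data to stdin and reading data from stdout and stderr. The optional input argument is the data to send to the child. communicate() returns a tuple (stdoutdata, stderrdata). Note: to send data to the child through its stdin, the stdin parameter must be set to PIPE when the Popen object is created. Likewise, to get data from stdout and stderr, stdout and stderr must be set to PIPE.
4) Popen.send_signal(signal): sends a signal to the child process.
5) Popen.terminate(): stops the child process. On Windows this calls the Windows API TerminateProcess() to end the child; on POSIX it sends SIGTERM.
6) Popen.kill(): kills the child process (SIGKILL on POSIX).
7) Popen.stdin: if stdin was set to PIPE when the Popen object was created, Popen.stdin is a file object for sending input to the child. Otherwise it is None.
8) Popen.stdout: if stdout was set to PIPE when the Popen object was created, Popen.stdout is a file object for reading the child's output. Otherwise it is None.
9) Popen.stderr: if stderr was set to PIPE when the Popen object was created, Popen.stderr is a file object for reading the child's error output. Otherwise it is None.
10) Popen.pid: the process ID of the child.
11) Popen.returncode: the child's exit code. It is None if the process has not finished yet.
12) subprocess.call(*popenargs, **kwargs): runs a command. The function waits until the child has finished and returns its returncode. The example at the start of this article uses call. It is the function to use when no interaction with the child is needed.
13) subprocess.check_call(*popenargs, **kwargs): works like subprocess.call(*popenargs, **kwargs), except that a non-zero returncode raises a CalledProcessError exception. The exception object carries the returncode.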
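
A few of these calls in action, as a sketch: check_call() turning a non-zero exit code into an exception, and poll(), terminate() and wait() on a long-running child. The helper names are made up, and the negative return code after terminate() assumes a POSIX system.

```python
import signal
import subprocess
import sys


def exit_code_from_check_call(code):
    """Return the child's exit code, taken from CalledProcessError if raised."""
    try:
        subprocess.check_call([sys.executable, "-c", "import sys; sys.exit(%d)" % code])
    except subprocess.CalledProcessError as exc:
        return exc.returncode  # only reached for a non-zero exit code
    return 0


def terminate_sleeping_child():
    """Start a child that would sleep for a minute and stop it early."""
    p = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    was_running = p.poll() is None  # None means "not finished yet"
    p.terminate()                   # SIGTERM on POSIX
    ret = p.wait()                  # reap the child and fetch returncode
    return was_running, ret


if __name__ == "__main__":
    print(exit_code_from_check_call(2))
    print(terminate_sleeping_child(), -signal.SIGTERM)
```

On POSIX a child killed by a signal reports the negative signal number as its returncode, which is how a terminated child can be told apart from one that exited on its own.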

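Deadlock

When a pipe is used but its output is not consumed, and the child writes so much data to the stdout or stderr pipe that the operating system's pipe buffer fills up, the child blocks until the parent reads from the pipe. If the parent is sitting in wait() at that moment, the result is the notorious deadlock.
Calls that can deadlock when combined with stdout=PIPE or stderr=PIPE:
    subprocess.call()
    subprocess.check_call()
    subprocess.check_output()  (with stderr=PIPE; it reads stdout itself)
    Popen.wait()
    In other words, whenever the child talks through a pipe and the parent waits for it to finish without reading, a deadlock is possible. For example:

    p = subprocess.Popen("longprint", shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    p.wait()

longprint is an imaginary program that produces a lot of output. In my environment (XP, Python 2.5) the deadlock occurred once the output reached 4096 bytes.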

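Avoiding pipe deadlocks with subprocess

1) Use Popen() together with communicate(). communicate() keeps reading the pipes while it waits for the child, so the buffer never fills up.
2) If you drain the output with p.stdout.readline() (or p.communicate()), no deadlock occurs however much output there is.
3) Or avoid pipes altogether: do not redirect at all, or redirect to a file.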
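
Options 1) and 3) can be sketched as follows. The noisy child stands in for the imaginary longprint program; the helper names and the one-megabyte output size are arbitrary choices for illustration.

```python
import subprocess
import sys
import tempfile


def noisy_child_cmd(n):
    """Command line for a hypothetical child that writes n characters to stdout."""
    return [sys.executable, "-c", "import sys; sys.stdout.write('x' * %d)" % n]


def read_with_communicate(n):
    # communicate() drains the pipe while waiting, so the child never blocks.
    p = subprocess.Popen(noisy_child_cmd(n), stdout=subprocess.PIPE)
    out, _ = p.communicate()
    return p.returncode, len(out)


def read_via_file(n):
    # No pipe at all: the child writes to a temporary file, so wait() is safe.
    with tempfile.TemporaryFile() as f:
        p = subprocess.Popen(noisy_child_cmd(n), stdout=f)
        ret = p.wait()
        f.seek(0)
        return ret, len(f.read())


if __name__ == "__main__":
    print(read_with_communicate(1000000))
    print(read_via_file(1000000))
```

Replacing communicate() with a bare p.wait() in the first function is the pattern that can hang, because one megabyte is far more than a typical pipe buffer holds.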

Original article: https://www.cnblogs.com/cl1024cl/p/6205372.html