• Python 3 Thread Pools and Process Pools


    Study notes compiled from several blog posts

    References: https://www.cnblogs.com/kaituorensheng/p/4465768.html#_label0

    https://www.cnblogs.com/lilyxiaoyy/p/11041239.html

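    When using Python for system administration, especially when operating on many files and directories at once or controlling many remote hosts, running the operations in parallel can save a great deal of time. When the number of targets is small, you can create processes on the fly with Process from multiprocessing. A dozen or so is fine, but with hundreds or thousands of targets, limiting the number of processes by hand becomes tedious. That is where a process pool comes in.
    Pool provides a fixed number of worker processes for the caller to use. When a new task is submitted to the pool, an idle worker picks it up if one is available. If every worker is busy, the task waits in the pool's queue until a worker finishes its current task and becomes free to run it.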
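One way to observe how a pool schedules tasks onto its processes is to submit more tasks than there are workers and record which PID ran each one. The sketch below does that; the helper names worker_pid and distinct_worker_pids are made up for illustration. In runs of this sketch the number of distinct PIDs should not exceed the pool size, which suggests the worker processes are reused across tasks.

```python
import multiprocessing
import os


def worker_pid(_):
    """Return the PID of the process that ran this task (the argument is ignored)."""
    return os.getpid()


def distinct_worker_pids(pool_size, task_count):
    """Run task_count tasks on a pool of pool_size workers and return the PIDs used."""
    with multiprocessing.Pool(processes=pool_size) as pool:
        # map() blocks until every task has finished
        return set(pool.map(worker_pid, range(task_count)))


if __name__ == "__main__":
    pids = distinct_worker_pids(pool_size=2, task_count=8)
    print("distinct worker PIDs:", len(pids))
```

How the eight tasks are split between the two workers can vary from run to run.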

    Example 1: Using a process pool

    import multiprocessing
    import time, os
    def func(msg):
        print(f"msg: {msg} processed by {os.getpid()} and parent pid is {os.getppid()}")
        time.sleep(3)
        print("end")
    if __name__ == "__main__":
        print(os.getpid())
        pool = multiprocessing.Pool(processes=4)
        for i in range(5):
            msg = "hello %d" % (i)
            pool.apply_async(func, (msg, ))
            # The pool keeps `processes` workers running; when a worker
            # finishes a task, it picks up the next queued one
        print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
        pool.close()
        pool.join()
        # close() must be called before join(), otherwise join() raises an error.
        # After close() the pool accepts no new tasks;
        # join() waits for all worker processes to finish
        print("Sub-process done.")

    Output from one run:

    17296
    Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
    msg: hello 0 processed by 47052 and parent pid is 17296
    msg: hello 1 processed by 15264 and parent pid is 17296
    msg: hello 2 processed by 52680 and parent pid is 17296
    msg: hello 3 processed by 18600 and parent pid is 17296
    end
    end
    msg: hello 4 processed by 47052 and parent pid is 17296
    end
    end
    end
    Sub-process done.

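    Function notes:

    • apply_async(func, args=(), kwds={}, callback=None, error_callback=None) is non-blocking; apply(func, args=(), kwds={}) is blocking (compare the output of Example 1 and Example 2 to see the difference).
    • close()    Closes the pool so that it accepts no new tasks.
    • terminate()    Stops the worker processes immediately, without completing outstanding tasks.
    • join()    Blocks the main process until the worker processes exit; join() must be called after close() or terminate().

    Because apply_async is non-blocking, the main process carries on without waiting for the tasks, so right after the for loop it prints "Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~". It then waits at pool.join() for all tasks to finish.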
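The callback and error_callback parameters of apply_async are not exercised by the examples here, so the following is a minimal sketch of how they might be used. The functions reciprocal and run_with_callbacks are hypothetical names chosen for illustration. The callbacks run in the parent process: callback receives the task's return value and error_callback receives the exception it raised.

```python
import multiprocessing


def reciprocal(x):
    """Return 1/x; raises ZeroDivisionError when x is 0."""
    return 1 / x


def run_with_callbacks(values):
    """Submit one task per value and collect results and errors via callbacks."""
    results, errors = [], []
    pool = multiprocessing.Pool(processes=2)
    for value in values:
        pool.apply_async(
            reciprocal,
            (value,),
            callback=results.append,       # receives the return value
            error_callback=errors.append,  # receives the raised exception
        )
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for the queued tasks to finish
    return sorted(results), errors


if __name__ == "__main__":
    results, errors = run_with_callbacks([1, 2, 0, 4])
    print("results:", results)
    print("errors:", [type(e).__name__ for e in errors])
```

Without an error_callback, an exception raised in a task only surfaces when get() is called on the returned result object.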

    Example 2: Using a process pool (blocking)

    import multiprocessing
    import time, os
    def func(msg):
        print(f"msg: {msg} processed by {os.getpid()} and parent pid is {os.getppid()}")
        time.sleep(3)
        print("end")
    if __name__ == "__main__":
        print(os.getpid())
        pool = multiprocessing.Pool(processes=4)
        for i in range(5):
            msg = "hello %d" % (i)
            # pool.apply_async(func, (msg, ))
            pool.apply(func, (msg, ))
            # apply() blocks until the task has finished, so the next
            # task is not submitted until this one is done
        print("Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~")
        pool.close()
        pool.join()
        # close() must be called before join(), otherwise join() raises an error.
        # After close() the pool accepts no new tasks;
        # join() waits for all worker processes to finish
        print("Sub-process done.")

    Output from one run:

    3552
    msg: hello 0 processed by 40088 and parent pid is 3552
    end
    msg: hello 1 processed by 44100 and parent pid is 3552
    end
    msg: hello 2 processed by 13824 and parent pid is 3552
    end
    msg: hello 3 processed by 24148 and parent pid is 3552
    end
    msg: hello 4 processed by 40088 and parent pid is 3552
    end
    Mark~ Mark~ Mark~~~~~~~~~~~~~~~~~~~~~~
    Sub-process done.

    Effect of blocking: the tasks in this block run serially, one after another.
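The difference between the two calls can also be measured. The sketch below is an illustration under assumed names (slow_square, run_blocking and run_non_blocking are not from the original posts) and a shortened sleep. It runs the same four tasks once with apply and once with apply_async and prints the elapsed time of each; the exact numbers depend on the machine.

```python
import multiprocessing
import time


def slow_square(x):
    """Simulate a slow task, then return x squared."""
    time.sleep(0.2)
    return x * x


def run_blocking(pool, values):
    """apply() waits for each task before the next one is submitted."""
    return [pool.apply(slow_square, (v,)) for v in values]


def run_non_blocking(pool, values):
    """apply_async() submits every task first, then the results are collected."""
    pending = [pool.apply_async(slow_square, (v,)) for v in values]
    return [p.get() for p in pending]


if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        for runner in (run_blocking, run_non_blocking):
            start = time.perf_counter()
            values = runner(pool, range(4))
            elapsed = time.perf_counter() - start
            print(f"{runner.__name__}: {values} in {elapsed:.2f}s")
```

With four workers, the blocking variant would be expected to take roughly four sleeps and the non-blocking one roughly a single sleep.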

    Example 3: Using a process pool (and collecting the results)

    import multiprocessing
    import time, os
    def func(msg):
        print(f"msg: {msg} processed by {os.getpid()} and parent pid is {os.getppid()}")
        time.sleep(3)
        print("end")
        return "done " + msg
    if __name__ == "__main__":
        pool = multiprocessing.Pool(processes=4)
        result = []
        for i in range(5):
            msg = "hello %d" % (i)
            # apply_async returns a result object; keep it to read the value later
            result.append(pool.apply_async(func, (msg, )))
        pool.close()
        pool.join()
        for res in result:
            print(":::", res.get())
        print("Sub-process(es) done.")

    Output from one run:

    msg: hello 0 processed by 50408 and parent pid is 40748
    msg: hello 1 processed by 6236 and parent pid is 40748
    msg: hello 2 processed by 52408 and parent pid is 40748
    msg: hello 3 processed by 52316 and parent pid is 40748
    end
    end
    msg: hello 4 processed by 50408 and parent pid is 40748
    end
    end
    end
    ::: done hello 0
    ::: done hello 1
    ::: done hello 2
    ::: done hello 3
    ::: done hello 4
    Sub-process(es) done.

    Call get() on each result object to obtain the value returned by the corresponding task.
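get() also accepts an optional timeout in seconds and raises multiprocessing.TimeoutError if the result is not ready in time. The following sketch illustrates this with a hypothetical slow_echo function and arbitrary delays; it is not taken from the referenced posts.

```python
import multiprocessing
import time


def slow_echo(msg, delay):
    """Wait for delay seconds, then return a message like func() in Example 3."""
    time.sleep(delay)
    return "done " + msg


if __name__ == "__main__":
    with multiprocessing.Pool(processes=1) as pool:
        res = pool.apply_async(slow_echo, ("hello", 1.0))
        try:
            # The task needs about one second, so a 0.1 s wait is too short
            print(res.get(timeout=0.1))
        except multiprocessing.TimeoutError:
            print("result not ready yet")
        # Without a timeout, get() blocks until the task has finished
        print(res.get())
```

A timed-out get() does not cancel the task; the later call still returns its result.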

    Example 4: Using a process pool to run several different tasks

    import multiprocessing
    import os, time, random
    def Lee():
        print("\nRun task Lee-%s" % (os.getpid()))  # os.getpid() returns the current process ID
        start = time.time()
        time.sleep(random.random() * 10)  # random.random() returns a float in [0, 1)
        end = time.time()
        print('Task Lee, runs %0.2f seconds.' % (end - start))
    def Marlon():
        print("\nRun task Marlon-%s" % (os.getpid()))
        start = time.time()
        time.sleep(random.random() * 40)
        end = time.time()
        print('Task Marlon runs %0.2f seconds.' % (end - start))
    def Allen():
        print("\nRun task Allen-%s" % (os.getpid()))
        start = time.time()
        time.sleep(random.random() * 30)
        end = time.time()
        print('Task Allen runs %0.2f seconds.' % (end - start))
    def Frank():
        print("\nRun task Frank-%s" % (os.getpid()))
        start = time.time()
        time.sleep(random.random() * 20)
        end = time.time()
        print('Task Frank runs %0.2f seconds.' % (end - start))
    if __name__ == '__main__':
        function_list = [Lee, Marlon, Allen, Frank]
        print("parent process %s" % (os.getpid()))
        pool = multiprocessing.Pool(4)
        for func in function_list:
            pool.apply_async(func)  # submit each function without blocking;
                                    # an idle worker picks up the next queued task
        print('Waiting for all subprocesses done...')
        pool.close()
        pool.join()  # close() must be called before join(), otherwise join() raises an error;
        # after close() the pool accepts no new tasks, and join() waits for all worker processes to finish
        print('All subprocesses done.')

    Output from one run:

    parent process 29224
    Waiting for all subprocesses done...
    
    Run task Lee-33772
    
    Run task Marlon-14784
    
    Run task Allen-24860
    
    Run task Frank-29684
    Task Lee, runs 2.53 seconds.
    Task Allen runs 6.69 seconds.
    Task Marlon runs 9.31 seconds.
    Task Frank runs 15.88 seconds.
    All subprocesses done.

    Example 5: multiprocessing Pool.map

    import multiprocessing
    import os
    def m1(x):
        print('%s is running and parent is %s'%(os.getpid(),os.getppid()))
        print(x * x)
    if __name__ == '__main__':
        print(os.getpid())
        # pool = multiprocessing.Pool(multiprocessing.cpu_count())
        pool = multiprocessing.Pool(4)
        print(multiprocessing.cpu_count())
        i_list = range(8)
        pool.map(m1, i_list)  # blocks until every item has been processed

    Output from one run:

    26040
    4
    23768 is running and parent is 26040
    0
    20896 is running and parent is 26040
    1
    23768 is running and parent is 26040
    4
    20896 is running and parent is 26040
    9
    20896 is running and parent is 26040
    25
    23768 is running and parent is 26040
    16
    30380 is running and parent is 26040
    49
    20896 is running and parent is 26040
    36
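In Example 5 the squares are printed by the workers, so their order on screen depends on scheduling. When the task returns its value instead, map() gives the results back as a list in input order, while imap_unordered() yields them as they complete. The following is a small sketch of that difference; the function name square is chosen here for illustration.

```python
import multiprocessing


def square(x):
    """Return x squared instead of printing it."""
    return x * x


if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        # map() preserves the order of the input iterable
        ordered = pool.map(square, range(8))
        # imap_unordered() yields results in completion order
        unordered = list(pool.imap_unordered(square, range(8)))
    print(ordered)
    print(sorted(unordered) == ordered)
```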

    Question: http://bbs.chinaunix.net/thread-4111379-1-1.html

    Passing a bound method to the pool:

    import multiprocessing

    class someClass(object):
        def __init__(self):
            pass
        def f(self, x):
            return x*x
        def go(self):
            pool = multiprocessing.Pool(processes=4)
            result = pool.apply_async(self.f, [10])
            print(result.get())
            print(pool.map(self.f, range(10)))
    if __name__ == '__main__':  # without this guard an error is raised on platforms that spawn workers, such as Windows
        s = someClass()
        s.go()

    Passing a plain function stored as an instance attribute:

    import multiprocessing
    import logging
    def create_logger(i):
        print(i)
    class CreateLogger(object):
        def __init__(self, func):
            self.func = func
    if __name__ == '__main__':
        ilist = range(10)
        cl = CreateLogger(create_logger)
        pool = multiprocessing.Pool(multiprocessing.cpu_count())
        pool.map(cl.func, ilist)
        print("hello------------>")
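A pool sends the callable and its arguments to the worker processes by pickling them, so whether a given callable can be used with a pool comes down to whether it can be pickled. The sketch below is a quick check along those lines; can_pickle is a hypothetical helper, not part of multiprocessing. In Python 3, module-level functions and bound methods of module-level classes normally pickle, while lambdas do not.

```python
import pickle


class SomeClass:
    def f(self, x):
        return x * x


def module_level(x):
    return x * x


def can_pickle(obj):
    """Return True if obj survives pickle.dumps(), False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print("module-level function:", can_pickle(module_level))
    print("bound method:", can_pickle(SomeClass().f))
    print("lambda:", can_pickle(lambda x: x * x))
```

A callable that fails this check will also fail when handed to apply_async or map.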

    Example 6: Another way to use a process pool

    The concurrent.futures module

    import time
    from concurrent.futures import ProcessPoolExecutor
    def func(name):
        print(f"{name} started")
        time.sleep(0.5)
        print(f"{name} finished")
    if __name__ == '__main__':
        p = ProcessPoolExecutor(max_workers=3)  # create a process pool with 3 workers
        for i in range(1, 10):
            p.submit(func, f"process {i}")  # submit a task to the pool
        p.shutdown()  # the main process waits for the submitted tasks to finish
        print("main process finished")
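An executor can also be used as a context manager, in which case shutdown() is called automatically when the with block exits. The sketch below shows that form together with executor.map(), which returns results in input order. The function name cube is made up for this illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def cube(n):
    """Return n cubed."""
    return n ** 3


if __name__ == "__main__":
    # Leaving the with block waits for pending tasks, like p.shutdown()
    with ProcessPoolExecutor(max_workers=3) as executor:
        results = list(executor.map(cube, range(1, 10)))
    print(results)
    print("main process finished")
```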

    Example 7: Using a thread pool

    import time
    from concurrent.futures import ThreadPoolExecutor
    def func(name):
        print(f"{name} started")
        time.sleep(0.5)
        print(f"{name} finished")
    if __name__ == '__main__':
        p = ThreadPoolExecutor(max_workers=3)  # create a thread pool with at most 3 threads working at once
        for i in range(1, 10):
            p.submit(func, f"thread {i}")
        p.shutdown()  # the main thread waits for the submitted tasks to finish
        print("main thread finished")

    In the concurrent.futures module, thread pools and process pools are used in the same way. The following example also shows how to obtain return values by calling the result() method.

    import os
    import time
    from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
    import threading
    import random
    def f(n):
        time.sleep(random.randint(1, 3))
        # print(n)
        # print("process (%s) square of %s: %s" % (os.getpid(), n, n*n))
        print("thread (%s) square of %s: %s" % (threading.current_thread().name, n, n * n))
        return n * n
    if __name__ == '__main__':
        pool = ThreadPoolExecutor(max_workers=5)
        # pool = ProcessPoolExecutor(max_workers=5)
        ret_list = []
        for i in range(10):
            ret = pool.submit(f, i)  # submit asynchronously: f is the function or method, i is its argument
            # print(ret.result())  # calling result() here would block on every task, making them serial
            ret_list.append(ret)
        # pool.shutdown()  # stop accepting new tasks and wait for the submitted ones; use with care
        for i in ret_list:
            print(i.result())

    Output from one run (shutdown() left commented out; lines that run together come from threads printing at the same time):

    thread (ThreadPoolExecutor-0_2) square of 2: 4
    thread (ThreadPoolExecutor-0_0) square of 0: 0
    thread (ThreadPoolExecutor-0_3) square of 3: 9
    0
    thread (ThreadPoolExecutor-0_1) square of 1: 1
    1
    4
    9
    thread (ThreadPoolExecutor-0_3) square of 7: 49thread (ThreadPoolExecutor-0_4) square of 4: 16
    thread (ThreadPoolExecutor-0_2) square of 5: 25
    16
    25
    thread (ThreadPoolExecutor-0_0) square of 6: 36
    36
    49
    thread (ThreadPoolExecutor-0_4) square of 9: 81
    thread (ThreadPoolExecutor-0_1) square of 8: 64
    64
    81

    Output when shutdown() is called:

    thread (ThreadPoolExecutor-0_0) square of 0: 0
    thread (ThreadPoolExecutor-0_1) square of 1: 1
    thread (ThreadPoolExecutor-0_4) square of 4: 16
    thread (ThreadPoolExecutor-0_3) square of 3: 9thread (ThreadPoolExecutor-0_0) square of 5: 25
    thread (ThreadPoolExecutor-0_2) square of 2: 4

    thread (ThreadPoolExecutor-0_1) square of 6: 36
    thread (ThreadPoolExecutor-0_2) square of 9: 81thread (ThreadPoolExecutor-0_0) square of 8: 64

    thread (ThreadPoolExecutor-0_4) square of 7: 49
    0
    1
    4
    9
    16
    25
    36
    49
    64
    81
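Iterating over the futures in submission order, as in the runs shown here, means a slow early task delays the printing of results that are already available. concurrent.futures.as_completed() yields futures as they finish instead. The following is a minimal sketch of that alternative; delayed_square and squares_in_completion_order are hypothetical names and the delays are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def delayed_square(n, delay):
    """Sleep for delay seconds, then return n squared."""
    time.sleep(delay)
    return n * n


def squares_in_completion_order(jobs):
    """jobs is a list of (n, delay) pairs; results are returned as tasks finish."""
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(delayed_square, n, delay) for n, delay in jobs]
        # as_completed() yields each future once it is done
        return [future.result() for future in as_completed(futures)]


if __name__ == "__main__":
    print(squares_in_completion_order([(1, 0.3), (2, 0.1), (3, 0.2)]))
```

The same function works with ProcessPoolExecutor, since both executors return the same kind of Future objects.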

    A callback can also be attached to a future with add_done_callback(); it is called with the finished Future object:

    import time
    from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
    import random
    def f(n):
        time.sleep(random.randint(1, 3))
        return n * n
    def call_back(m):
        print(m)           # m is the finished Future object
        print(m.result())  # the value returned by f
    if __name__ == '__main__':
        pool = ThreadPoolExecutor(max_workers=5)
        pool.submit(f, 2).add_done_callback(call_back)
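A done-callback is also called when the task raised an exception, in which case calling result() inside it re-raises that exception. The sketch below is one way a callback might separate the two cases using Future.exception(). The names safe_divide, collect_outcomes and on_done are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def safe_divide(a, b):
    """Return a / b; raises ZeroDivisionError when b is 0."""
    return a / b


def collect_outcomes(pairs):
    """Run safe_divide on each (a, b) pair and record how each task ended."""
    outcomes = []

    def on_done(future):
        error = future.exception()  # None if the task succeeded
        if error is None:
            outcomes.append(("ok", future.result()))
        else:
            outcomes.append(("error", type(error).__name__))

    # Leaving the with block waits until every task has finished
    with ThreadPoolExecutor(max_workers=2) as pool:
        for a, b in pairs:
            pool.submit(safe_divide, a, b).add_done_callback(on_done)
    return sorted(outcomes)


if __name__ == "__main__":
    print(collect_outcomes([(4, 2), (1, 0)]))
```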
  • Original article: https://www.cnblogs.com/liushoudong/p/12357040.html