Processes and Threads

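Processes and threads are basic operating-system concepts. A computer consists of hardware and software. The CPU is the core of the hardware and carries out the machine's computation.
The operating system is the software that runs directly on the hardware and manages the computer: it manages and allocates resources and schedules tasks.
A program is software that runs on the system and provides some function, such as a browser or a music player.
Each time a program is executed it performs some function; a browser, for example, opens web pages for us. To keep each execution independent, the system needs a dedicated data structure that manages and controls the running program: the process control block (PCB).
A process is one dynamic execution of a program over a set of data. A process generally consists of three parts: the program, the data set, and the process control block.
The program we write describes what the process should do and how; the data set is the resources the program needs while it runs; the process control block records the external characteristics of the process and describes how its execution changes over time. The system uses it to control and manage the process, and it is the only sign by which the system knows the process exists.
A thread is the smallest unit that the operating system can schedule. It is contained in a process and is the unit that actually does the work inside the process.
A thread is a single sequential flow of control within a process.
A process can run several threads concurrently, each carrying out a different task.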
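
1. A process is inert: it never executes anything itself and is only a container for threads. For a process to do anything, it must have a thread running in its context, and that thread executes the code contained in the process's address space.
2. When a process is created, the operating system automatically creates its first thread, called the main thread. That thread can then create other threads.
3. Relationship between threads and processes: a thread belongs to a process and runs inside the process's address space. Threads created by the same process share the same memory space, and when the process exits, all of its threads are forcibly cleaned up and terminated. A thread can share all resources owned by its process with the other threads of that process, but it owns almost no system resources itself, only the few things it needs in order to run (such as a program counter, a set of registers, and a stack).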
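As a small illustration of point 3, the sketch below starts a few threads that all append to one list and record the process id they run in. The names shared and worker are made up for this example.

```python
import os
import threading

shared = []  # lives in the process's memory, so every thread sees the same list

def worker(tag):
    # Each thread appends to the same list and records the pid it runs in.
    shared.append((tag, os.getpid()))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

pids = {pid for _, pid in shared}
print(len(shared), pids == {os.getpid()})  # 3 True
```

All three entries carry the same pid, because the threads run inside one process and write to the same object.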
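
Differences between threads and processes:
Process: a collection of managed resources.
Thread: the smallest unit the operating system schedules; a sequence of instructions.

The first thread in a process is the main thread. The main thread can create other threads, those threads can create further threads, and all threads are peers.
Processes have parent and child processes, an independent memory space, and a unique process identifier (pid).

Starting a thread is faster than starting a process. Once running, neither is inherently faster than the other; their execution speeds are not comparable.
Threads share memory space; each process has its own memory.

When a parent process creates a child process, the child gets what amounts to a copy of the parent's memory space. Processes cannot access each other's memory directly.
Creating a new thread is cheap; creating a new process requires cloning its parent process (as fork does on Unix-like systems).
A thread can control and operate on other threads in the same process, but a process can only operate on its child processes.

Threads in the same process can communicate with each other directly.
Two processes that want to communicate must go through an intermediary (an inter-process communication mechanism such as a queue or a pipe).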
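To make "threads can communicate directly" concrete, here is a minimal sketch in which one thread writes into an ordinary dict and signals the other with threading.Event. The names message, ready, sender and receiver are illustrative only.

```python
import threading

message = {}               # plain dict shared by both threads
ready = threading.Event()  # signals that the message has been written

def sender():
    message['text'] = 'hello from sender'
    ready.set()

def receiver(out):
    ready.wait(timeout=5)  # block until the sender signals (or 5 seconds pass)
    out.append(message.get('text'))

received = []
r = threading.Thread(target=receiver, args=(received,))
s = threading.Thread(target=sender)
r.start()
s.start()
r.join()
s.join()
print(received)  # ['hello from sender']
```

No intermediary is needed, because both threads can reach the same objects. Two separate processes would have to pass the message through something like a multiprocessing Queue or Pipe instead.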
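
Characteristics of a process:
Dynamic: a process is, in essence, one execution of a program; processes are created and destroyed dynamically.
Concurrent: any process can execute concurrently with other processes.
Independent: a process is a basic unit that can run independently, and it is also an independent unit of resource allocation and scheduling.
Asynchronous: each process advances at its own independent, unpredictable speed.

A process consists of three parts: the program, the data, and the process control block.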

There are three ways to implement multitasking:
Multi-process mode
Multi-thread mode
Multi-process + multi-thread mode


# Direct call: pass the target function to threading.Thread
import threading
import time

def run(n):
    # time.sleep(2)
    print('task', n)

t1 = threading.Thread(target=run, args=('t1',))
t2 = threading.Thread(target=run, args=('t2',))
t1.start()
t2.start()

# task t1
# task t2


Subclass-style call
import threading

class MyThread(threading.Thread):
    def __init__(self, n):
        super().__init__()
        self.n = n

    def run(self):
        # start() invokes run() in the new thread, so the method must be named run
        print('With this approach the method must be named run', self.n)

t1 = MyThread('t1')
t2 = MyThread('t2')
t1.start()
t2.start()

# With this approach the method must be named run t1
# With this approach the method must be named run t2


Timing the tasks with conventional (sequential) code
import time

def task1():
    time.sleep(5)
    print('Task 1 done', time.ctime())

def task2():
    time.sleep(5)
    print('Task 2 done', time.ctime())

print('Start time before running the tasks', time.ctime())
task1()
task2()
print('Finished; end time', time.ctime())

# Start time before running the tasks Sun Mar 10 08:21:49 2019
# Task 1 done Sun Mar 10 08:21:54 2019
# Task 2 done Sun Mar 10 08:21:59 2019
# Finished; end time Sun Mar 10 08:21:59 2019


Running the tasks in threads so that they overlap, and checking how long they take (each task sleeps 3 seconds here rather than 5)
import time
import threading

def task1():
    time.sleep(3)
    print('Task 1 done', time.ctime())

def task2():
    time.sleep(3)
    print('Task 2 done', time.ctime())

print('Start time before running the tasks', time.ctime())
t1 = threading.Thread(target=task1)
t2 = threading.Thread(target=task2)
t1.start()
t2.start()
t1.join()  # wait for both threads before reporting completion
t2.join()
print('Finished')

# Sample output; the two threads print at the same moment, so their lines can interleave:
# Start time before running the tasks Sun Mar 10 08:28:44 2019
# Task 2 done Task 1 done Sun Mar 10 08:28:47 2019
# Sun Mar 10 08:28:47 2019
# Finished
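To compare the two approaches on identical work, a rough sketch like the following measures elapsed time with time.perf_counter. The helper names nap, run_sequential and run_threaded are made up here, and the sleeps are shortened to 0.2 seconds so the script finishes quickly.

```python
import threading
import time

def nap(seconds):
    """Stand-in for an I/O-bound task (hypothetical helper)."""
    time.sleep(seconds)

def run_sequential(n_tasks, seconds):
    # Run the tasks one after another and return the elapsed time.
    start = time.perf_counter()
    for _ in range(n_tasks):
        nap(seconds)
    return time.perf_counter() - start

def run_threaded(n_tasks, seconds):
    # Run each task in its own thread and return the elapsed time.
    start = time.perf_counter()
    threads = [threading.Thread(target=nap, args=(seconds,)) for _ in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

sequential = run_sequential(2, 0.2)
threaded = run_threaded(2, 0.2)
print('sequential: %.2fs, threaded: %.2fs' % (sequential, threaded))
```

The sequential run should take roughly the sum of the sleeps, while the threaded run should take roughly as long as one sleep, because the waits overlap.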


Every process starts one thread by default, which we call the main thread, and the main thread can start new threads. Python's threading module provides current_thread(), which always returns the instance of the current thread. The main thread's instance is named MainThread; a child thread's name is given when the thread instance is created, and here the child thread is named hello.
The name is used only for display when printing and has no other meaning. If no name is given, Python names threads automatically: Thread-1, Thread-2, ... (newer Python versions also append the target function's name).
import time
import threading

def loop():
    print('Thread %s is running...' % threading.current_thread().name)
    n = 0
    while n < 5:
        n += 1
        print('Thread %s>>>%s' % (threading.current_thread().name, n))
        time.sleep(2)
    print('Thread %s ended' % threading.current_thread().name)

print('Main thread %s is running...' % threading.current_thread().name)
# Name the thread hello
t = threading.Thread(target=loop, name='hello')
t.start()
t.join()
print('Main thread %s ended' % threading.current_thread().name)

# Main thread MainThread is running...
# Thread hello is running...
# Thread hello>>>1
# Thread hello>>>2
# Thread hello>>>3
# Thread hello>>>4
# Thread hello>>>5
# Thread hello ended
# Main thread MainThread ended
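

When a child thread is marked as a daemon thread (setDaemon(True), or the daemon attribute in current Python), it is terminated as soon as the main thread and all other non-daemon threads have finished. A daemon thread may therefore be stopped before its task is complete. The default is non-daemon (daemon=False).
Marking a thread as a daemon says that the thread is not important: when the process exits, it does not wait for this thread.
import time
import threading

def task():
    print('start fun', time.ctime())
    time.sleep(2)
    print('end fun', time.ctime())

t1 = threading.Thread(target=task)
print('Thread name', t1.name, time.ctime())  # show the thread instance's name
t1.daemon = True  # mark t1 as a daemon thread (replaces the deprecated setDaemon(True))
t1.start()
time.sleep(1)
# The main thread ends after this line, so the daemon is stopped before it prints 'end fun'
print(threading.current_thread().name, time.ctime())

# Thread name Thread-1 Sun Mar 10 16:58:14 2019
# start fun Sun Mar 10 16:58:14 2019
# MainThread Sun Mar 10 16:58:15 2019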
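If a daemon thread's work does need to finish, the main thread can still wait for it explicitly with join(). A minimal sketch, with events and task as illustrative names:

```python
import threading
import time

events = []  # shared list used only to record what the worker did

def task():
    events.append('start')
    time.sleep(0.2)
    events.append('end')

t = threading.Thread(target=task, daemon=True)  # daemon flag set at construction
t.start()
t.join()  # block until the daemon thread has finished its work
print(t.daemon, events)  # True ['start', 'end']
```

Without the join() call, the interpreter would be free to exit while the thread is still sleeping, and 'end' might never be recorded.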
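

The biggest difference between multiple processes and multiple threads is that, with multiple processes, each process has its own copy of every variable and the copies do not affect each other, whereas with multiple threads the process's variables (other than each thread's own local variables) are shared by all threads. Any shared variable can therefore be modified by any thread, so the greatest danger in sharing data between threads is that several threads modify the same variable at the same time and corrupt its contents.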


# Without a lock
import threading

balance = 0

def task(n):
    global balance
    # Each statement is a read-modify-write, so another thread can run between the read and the write
    balance += n
    balance -= n

def task2(arg, n):
    while arg > 0:
        task(n)
        arg -= 1

t1 = threading.Thread(target=task2, args=(880000, 5))
t2 = threading.Thread(target=task2, args=(970000, 6))
t1.start()
t2.start()
t1.join()
t2.join()
print(balance)  # expected 0, but without a lock the result may be non-zero


# With a lock
import threading

balance = 0
lock = threading.RLock()

def task(n):
    global balance
    balance += n
    balance -= n

def task2(arg, n):
    while arg > 0:
        lock.acquire()  # acquire the lock
        try:
            task(n)
        finally:
            lock.release()  # always release, even if task() raises
        arg -= 1

t1 = threading.Thread(target=task2, args=(1880000, 5))
t2 = threading.Thread(target=task2, args=(1770000, 6))
t1.start()
t2.start()
t1.join()
t2.join()
print(balance)  # 0
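
When several threads call lock.acquire() at the same time, only one of them obtains the lock and goes on to run the code; the others wait until they can obtain it.
The thread that holds the lock must release it when it is done; otherwise the threads waiting for it will wait forever and become dead threads.
The benefit of a lock is that it guarantees that a piece of code is executed from start to finish by only one thread at a time. There are also drawbacks. First, it prevents concurrent execution: the code protected by the lock effectively runs in single-threaded mode, which lowers efficiency considerably. Second, because there can be several locks, threads that each hold a different lock and try to acquire the one held by the other can deadlock. All the threads involved then hang, able neither to run nor to finish, and the program can only be ended by having the operating system kill it.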
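One common way to avoid the deadlock described above is to make every thread take the locks in the same order. The sketch below assumes a made-up Account class with one lock per account, and a transfer function that always locks the accounts in name order.

```python
import threading

class Account:
    """Hypothetical account guarded by its own lock."""
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Always take the two locks in the same global order (here: by name), so two
    # opposite transfers can never each hold one lock while waiting for the other.
    first, second = sorted((src, dst), key=lambda acc: acc.name)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount

a = Account('a', 100)
b = Account('b', 100)

def worker(src, dst):
    for _ in range(10000):
        transfer(src, dst, 1)

t1 = threading.Thread(target=worker, args=(a, b))
t2 = threading.Thread(target=worker, args=(b, a))
t1.start()
t2.start()
t1.join()
t2.join()
print(a.balance, b.balance)  # 100 100
```

The with statement releases each lock automatically, even if the body raises, so the lock can never be left held by mistake.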
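
In Python (more precisely, in standard CPython builds, because of the global interpreter lock, GIL), multiple threads cannot execute Python code on several cores at once, so multithreading cannot be used for multi-core tasks; multiple processes can.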

# Without a lock
import time
import threading

globals_num = 0
# lock = threading.RLock()

def Func():
    # lock.acquire()
    global globals_num
    globals_num += 1
    time.sleep(1)
    print(globals_num)
    # lock.release()

for i in range(10):
    t = threading.Thread(target=Func)
    t.start()

# Sample output: all ten threads increment before any of them prints, so each prints 10,
# and the unsynchronized print calls run together on the same lines:
# 1010
#
# 10
# 101010
#
# 10
#
# 101010


# With a lock
import time
import threading

globals_num = 0
lock = threading.RLock()

def Func():
    global globals_num
    with lock:  # acquired on entry and released on exit, even if an exception is raised
        globals_num += 1
        time.sleep(1)
        print(globals_num)

for i in range(10):
    t = threading.Thread(target=Func)
    t.start()

# 1
# 2
# 3
# 4
# 5
# 6
# 7
# 8
# 9
# 10
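
In a multithreaded environment each thread has its own data. It is better for a thread to use its own local variables than global variables, because local variables are visible only to that thread and cannot affect other threads, whereas any modification of a global variable must be protected by a lock.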
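A small sketch of that advice: each thread accumulates into a local variable and takes a lock only for the single write to shared state. The names results, results_lock and sum_range are invented for this example.

```python
import threading

results = {}  # shared: one entry per thread name
results_lock = threading.Lock()

def sum_range(start, stop):
    total = 0  # local variable, private to the thread running this call
    for value in range(start, stop):
        total += value
    with results_lock:  # the shared dict is the only state that needs the lock
        results[threading.current_thread().name] = total

threads = [
    threading.Thread(target=sum_range, args=(0, 50000), name='low'),
    threading.Thread(target=sum_range, args=(50000, 100000), name='high'),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results['low'] + results['high'])  # 4999950000
```

The loop that does the real work runs without any locking, because total is never visible to the other thread.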
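
ThreadLocal is called a thread-local variable in many places and thread-local storage in others; the meaning is much the same. A ThreadLocal creates a separate copy of a variable for each thread, so each thread accesses only its own copy.
The most common use of ThreadLocal is to bind a database connection, an HTTP request, user identity information, and so on to each thread, so that every handler function the thread calls can reach these resources conveniently.

In the example below, local_data has no x attribute in the new thread, and an assignment made in the new thread does not affect other threads.
If you uncomment local_data=Widgt(), local_data becomes an ordinary object shared by all threads.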

import threading

class Widgt(object):
    pass

def test():
    local_data = threading.local()
    # local_data = Widgt()
    local_data.x = 1

    def thread_func():
        # hasattr(obj, name) reports whether the object has the attribute
        print('Has x in new thread:%s' % hasattr(local_data, 'x'))
        local_data.x = 2
        print('Has x in new thread:%s' % hasattr(local_data, 'x'))
        print('x in new thread is %s' % local_data.x)

    t = threading.Thread(target=thread_func)
    t.start()
    t.join()
    print('x in pre thread is %s' % local_data.x)

if __name__ == '__main__':
    test()

# Has x in new thread:False
# Has x in new thread:True
# x in new thread is 2
# x in pre thread is 1


Creating a global ThreadLocal object
import threading

local_school = threading.local()

def process_student():
    # Get the student bound to the current thread
    std = local_school.student
    print('Hello,%s(in %s)' % (std, threading.current_thread().name))

def process_thread(name):
    # Bind the student to the ThreadLocal for this thread
    local_school.student = name
    process_student()

t1 = threading.Thread(target=process_thread, args=('Alice',), name='Thread-A')
t2 = threading.Thread(target=process_thread, args=('Bob',), name='Thread-B')
t1.start()
t2.start()
t1.join()
t2.join()

# Hello,Alice(in Thread-A)
# Hello,Bob(in Thread-B)


The multiprocessing module provides a Process class that represents a process object; multiprocessing is the cross-platform module for working with multiple processes.
import time
import multiprocessing

def add(number, value, lock):
    # number is a plain int passed as an argument, so each process works on its own copy
    with lock:  # only one process at a time runs this block
        print('init add{0} number={1}'.format(value, number))
        for i in range(1, 6):
            number += value
            time.sleep(1)
            print('add{0} number={1}'.format(value, number))

if __name__ == '__main__':
    lock = multiprocessing.Lock()
    number = 0
    p1 = multiprocessing.Process(target=add, args=(number, 1, lock))
    p2 = multiprocessing.Process(target=add, args=(number, 3, lock))
    p1.start()
    p2.start()
    print('main end')

# main end
# init add1 number=0
# add1 number=1
# add1 number=2
# add1 number=3
# add1 number=4
# add1 number=5
# init add3 number=0
# add3 number=3
# add3 number=6
# add3 number=9
# add3 number=12
# add3 number=15


The following example starts a child process and waits for it to finish

from multiprocessing import Process
import os

# Code run by the child process
def run_proc(name):
    print('Run child process %s(%s)...' % (name, os.getpid()))

if __name__ == '__main__':
    print('Parent process %s.' % (os.getpid()))
    p = Process(target=run_proc, args=('test',))
    print('Child process will start.')
    p.start()  # start the child process
    p.join()   # wait for the child to finish before continuing; commonly used to synchronize processes
    print('Child process end.')

# Parent process 6012.
# Child process will start.
# Run child process test(8724)...
# Child process end.
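

The Pool class. When Python is used for system administration, especially when operating on many files and directories at once or controlling many remote hosts, running operations in parallel can save a great deal of time. When the number of targets is small, you can create processes directly with the Process class. A dozen or so is manageable, but with hundreds or more, limiting the number of processes by hand becomes very tedious, and this is where a process pool is useful.
Pool provides a fixed number of worker processes for the caller to use. When a new task is submitted to the pool, an idle worker picks it up and runs it. If all workers are busy, the task waits in the pool's queue until a worker becomes free.
The following methods of the Pool class in the multiprocessing module are introduced below
apply()
apply(func[,args=()[,kwds={}]]) calls func with the given arguments and blocks the calling process until the function returns. Because it blocks, apply_async() is usually preferred. (Pool.apply is still available in Python 3; it is the built-in apply() function that was removed in 3.x.)
apply_async()
apply_async(func[,args=()[,kwds={}[,callback=None]]]) is used like apply(), but it does not block: it returns a result object immediately and can pass the result to a callback.
map()
map(func,iterable[,chunksize=None]) The map method of Pool is used much like the built-in map function and blocks the calling process until the results are returned. Note: although the second argument is an iterable, in practice the whole iterable is consumed before any work is handed to the worker processes.
close() closes the pool so that it accepts no new tasks.
terminate() stops the worker processes without finishing outstanding tasks.
join() blocks the main process until the worker processes have exited; join() must be called after close() or terminate().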
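As a rough sketch of how apply(), map() and apply_async() differ in use, consider the following. The function names square and main are made up for this example, and the pool work is placed under the usual __main__ guard so that worker processes can import the module safely.

```python
from multiprocessing import Pool

def square(n):
    # Defined at module level so that it can be pickled and sent to the workers.
    return n * n

def main():
    with Pool(2) as pool:  # two worker processes
        print(pool.apply(square, (3,)))          # blocks until the single call returns
        print(pool.map(square, range(5)))        # blocks until all results are ready, in input order
        result = pool.apply_async(square, (4,))  # returns an AsyncResult immediately
        print(result.get(timeout=10))            # wait for the value here

if __name__ == '__main__':
    main()
```

Run as a script, this should print 9, then [0, 1, 4, 9, 16], then 16. Leaving the with block terminates the pool, which is safe here because every result has already been collected.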

from multiprocessing import Pool
import os
import time
import random

def long_time_task(name):
    # os.getpid() returns the id of the current process
    print('Run task %s(%s)...' % (name, os.getpid()), time.ctime())
    start = time.time()
    time.sleep(random.random() * 3)
    end = time.time()
    print('Task %s ran %0.2f seconds' % (name, (end - start)), time.ctime())

if __name__ == '__main__':
    print('Parent process %s.' % os.getpid(), time.ctime())
    p = Pool(4)  # create a pool with 4 worker processes
    for i in range(5):
        p.apply_async(long_time_task, args=(i,))
    print('Waiting for all subprocesses done...', time.ctime())
    p.close()
    p.join()
    print('All subprocesses done.', time.ctime())

# Parent process 9704. Mon Mar 11 16:14:31 2019
# Waiting for all subprocesses done... Mon Mar 11 16:14:31 2019
# Run task 0(5832)... Mon Mar 11 16:14:31 2019
# Run task 1(7004)... Mon Mar 11 16:14:31 2019
# Run task 2(6956)... Mon Mar 11 16:14:31 2019
# Run task 3(1632)... Mon Mar 11 16:14:31 2019
# Task 3 ran 1.00 seconds Mon Mar 11 16:14:32 2019
# Run task 4(1632)... Mon Mar 11 16:14:32 2019
# Task 1 ran 1.98 seconds Mon Mar 11 16:14:33 2019
# Task 2 ran 2.36 seconds Mon Mar 11 16:14:34 2019
# Task 0 ran 2.98 seconds Mon Mar 11 16:14:34 2019
# Task 4 ran 2.01 seconds Mon Mar 11 16:14:34 2019
# All subprocesses done. Mon Mar 11 16:14:35 2019
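

Processes certainly need to communicate, and the operating system provides many mechanisms for inter-process communication. Python's multiprocessing module wraps the underlying mechanisms and offers several ways to exchange data, such as Queue and Pipe. Taking Queue as an example, the parent process below creates two child processes: one writes data to the Queue and the other reads data from it.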

from multiprocessing import Process, Queue
import os
import time
import random

# Code run by the writer process
def write(q):
    print('Process to write:%s' % os.getpid())
    for value in ['A', 'B', 'C']:
        print('Put %s to queue...' % value)
        q.put(value)
        time.sleep(random.random())

# Code run by the reader process
def read(q):
    print('Process to read:%s' % os.getpid())
    while True:
        value = q.get(True)  # block until an item is available
        print('Get %s from queue.' % value)

if __name__ == '__main__':
    q = Queue()  # the parent creates the Queue and passes it to both children
    w = Process(target=write, args=(q,))
    r = Process(target=read, args=(q,))
    w.start()  # start child process w, the writer
    r.start()  # start child process r, the reader
    w.join()   # wait for the writer to finish
    r.terminate()  # r loops forever and cannot be joined, so terminate it

# Process to write:9048
# Put A to queue...
# Process to read:1580
# Get A from queue.
# Put B to queue...
# Get B from queue.
# Put C to queue...
# Get C from queue.
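

queue.Queue (not to be confused with multiprocessing.Queue used above) is the thread-safe queue implementation in the Python standard library. It provides a first-in, first-out (FIFO) data structure suited to multithreaded programming and is used to pass messages between producer and consumer threads. Basic FIFO queue: class queue.Queue(maxsize=0)
FIFO means first in, first out. Queue provides a basic FIFO container and is simple to use. maxsize is an integer giving the upper limit on the number of items the queue can hold. Once the limit is reached, inserting blocks until items are consumed from the queue. If maxsize is less than or equal to 0, the queue size is unlimited.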

import queue

q = queue.Queue()
for i in range(5):
    q.put(i)
while not q.empty():
    print(q.get())

# 0
# 1
# 2
# 3
# 4
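The producer and consumer use mentioned above might look roughly like this with two threads. The sentinel value None is an assumption of this sketch (it must never occur as a real item), and the names producer, consumer and received are illustrative.

```python
import queue
import threading

q = queue.Queue()
received = []
SENTINEL = None  # assumption: None never occurs as a real item

def producer():
    for value in ['A', 'B', 'C']:
        q.put(value)
    q.put(SENTINEL)  # tell the consumer that no more items will arrive

def consumer():
    while True:
        value = q.get()  # blocks until an item is available
        if value is SENTINEL:
            break
        received.append(value)

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(received)  # ['A', 'B', 'C']
```

Because the consumer stops by itself when it sees the sentinel, both threads can be joined normally and nothing has to be terminated from outside.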


LIFO queue: last in, first out
class queue.LifoQueue(maxsize=0)

import queue

q = queue.LifoQueue(maxsize=0)
for i in range(5):
    q.put(i)
while not q.empty():
    print(q.get())

# 4
# 3
# 2
# 1
# 0
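The maxsize limit applies to LifoQueue in the same way as to Queue. The sketch below uses put_nowait() so that a full queue raises queue.Full instead of blocking; the variable names overflowed and order exist only for illustration.

```python
import queue

q = queue.LifoQueue(maxsize=2)
q.put('a')
q.put('b')
print(q.full())  # True

overflowed = False
try:
    q.put_nowait('c')  # a plain put('c') would block here until an item is removed
except queue.Full:
    overflowed = True
    print('queue is full')

order = [q.get()]  # 'b' comes out first: last in, first out
q.put('c')         # there is room again after the get
order.append(q.get())
order.append(q.get())
print(order)  # ['b', 'c', 'a']
```

In multithreaded code the blocking put() is usually what you want, since it makes a fast producer wait for a slow consumer; put_nowait() is used here only so that the example never hangs.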
Original article: https://www.cnblogs.com/zpdbkshangshanluoshuo/p/10513105.html